Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Creating forward values of a variables based on year of birth.

    Dear Statalist Users,

    I am working with DHS data to explore the effect of refugee exposure in their country on their health outcomes. CASEID defines children born between 2003 and 2015. Plate is their city of residence at the time of survey. Refpop variable is my treatment variable. I would like to create forward values of my treatment variable to test the common trend assumption/pre-existing trends. So, for example, if the value of refpop variable for children born in 2003 is 0, I want my refpop(t+1) variable to be 0 for children born in 2004. Or alternatively, if the value of refpop variable for children born in 2003 is 0, I want my refpop(t+2) variable to be 0 for children born in 2005. Even though it is not shown the data I extracted here, refpop variable for children born in 2012 is missing. And it takes 0 for children born between 2003 and 2011 because there were no any refugees in these years in the country of interest. The actual values of refpop variable starts in 2013. So it takes actual values for children born in and after 2013. Therefore, I would like to create forward values of my treatment variable (refpop) in t+8. However, I could not do it using the following code gen refpop_forward1= refpop[_n+1] (this is an example for t+1) as it does not do the thing I want.

    If anyone can help me I will be really happy.

    Thank you so much.





    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str15 CASEID float(plate child_birth_year) double refpop
    "       60521  2" 35 2003 0
    "      3416 5  2" 27 2003 0
    "      341211  2" 27 2003 0
    "      1811 8  2" 42 2003 0
    "      331713  4" 65 2003 0
    "      203414  2" 46 2003 0
    "      300213  4"  4 2003 0
    "      2123 7  2" 31 2003 0
    "      2701 5  2"  5 2003 0
    "      3619 2  2" 56 2003 0
    "       50210  2" 35 2003 0
    "       13018  2" 34 2003 0
    "      221022  2" 38 2003 0
    "      130313  2" 26 2003 0
    "       13719  2" 34 2003 0
    "       137 6  2" 34 2003 0
    "      290813  9" 52 2003 0
    "      1601 3  2" 42 2003 0
    "      1903 2  2"  1 2003 0
    "      1014 2  4" 16 2003 0
    "      250214  2" 37 2003 0
    "      310910  2" 25 2003 0
    "      2008 9  2"  7 2003 0
    "      201614  2" 31 2003 0
    "       102 3  2" 34 2003 0
    "       60819  2" 45 2003 0
    "      3213 2  2" 44 2003 0
    "      2006 9  2"  7 2003 0
    "      220523  2" 38 2003 0
    "      222417  2" 68 2003 0
    "      100619  2" 16 2003 0
    "      202825  2" 33 2003 0
    "      2603 7  2" 19 2003 0
    "       13812  2" 34 2003 0
    "       14621  2" 34 2003 0
    "      151110  2"  6 2003 0
    "      190414  2"  1 2003 0
    "      360215  5"  2 2004 0
    "      2024 2  2" 33 2004 0
    "      300321  2"  4 2004 0
    "      2139 3  2" 80 2004 0
    "      3017 3  2" 25 2004 0
    "      150213  2"  6 2004 0
    "      350921  2" 27 2004 0
    "       51414  2" 35 2004 0
    "      200513  2"  7 2004 0
    "      2131 7  2" 33 2004 0
    "      130113  2" 11 2004 0
    "      3016 8  2" 25 2004 0
    "      2604 4  2" 19 2004 0
    "       323 8  2" 59 2004 0
    "       50716  2" 35 2004 0
    "      3410 1  1" 27 2004 0
    "      341610  4" 27 2004 0
    "      161022  4" 42 2004 0
    "      221613  2" 58 2004 0
    "       60122  2"  9 2004 0
    "      321118  2" 44 2004 0
    "      201522  2" 31 2004 0
    "      300210  5"  4 2004 0
    "      111110  2" 54 2004 0
    "      3413 3  2" 27 2004 0
    "       10321  2" 34 2004 0
    "      341319  1" 27 2004 0
    "      300320  1"  4 2004 0
    "      241110  2" 55 2004 0
    "      3101 2  3"  4 2004 0
    "      101522  2" 16 2004 0
    "      3507 1  1" 21 2004 0
    "      3207 7  2" 23 2004 0
    "      3116 1  2" 36 2004 0
    "      230515  4" 50 2004 0
    "      212615  2" 32 2004 0
    "      341010  2" 27 2004 0
    "      2210 3  2" 38 2004 0
    "      311614  4" 36 2004 0
    "      282723  2" 61 2004 0
    "      281816  2" 53 2004 0
    "       115 3  2" 34 2004 0
    "      140314  2" 26 2004 0
    "      191118  4"  1 2004 0
    "       12710  2" 34 2004 0
    "      240314  2" 55 2004 0
    "      3108 2  2" 25 2004 0
    "      100617  2" 16 2004 0
    "      2409 4  2" 55 2004 0
    "      1302 2  2" 14 2004 0
    "      3202 2  2" 13 2004 0
    "      3103 5  4"  4 2004 0
    "      320818  2" 23 2004 0
    "      2513 6  2" 67 2004 0
    "      3403 5  2" 27 2004 0
    "      1911 6  2"  1 2004 0
    "      311110  1" 25 2004 0
    "      2416 5  2" 55 2004 0
    "      260317  2" 19 2004 0
    "      160615  4" 42 2004 0
    "      351914  2" 72 2004 0
    "       51112  2" 35 2004 0
    "      200424  2"  7 2004 0
    end

  • #2
    It's not very clear in the question, but I assumed that the refpop indicator is the same within each year (aka, if 2003 is 0, all 2003 is 0; if 2015 is 1, all 2015 is 1). If that's correct, I'd first create a year-level refpop data, and then create the lagged variables there, follow by a merge. Here are some sample codes, I created my own data set that spans across the range just to be sure:

    Code:
    clear
    input caseid child_birth_year refpop
    1  2003 0
    2  2004 0
    3  2005 0
    4  2006 0
    5  2007 0
    6  2008 0
    7  2009 0
    8  2010 0
    9  2011 0
    10 2013 1
    11 2014 1
    12 2015 1
    end
    
    * Collapse into annual summary
    collapse (mean) refpop, by(child_birth_year)
    
    * Add a missing row for year 2012
    set obs 13
    replace child_birth_year = 2012 in 13
    gsort child_birth_year
    
    * Create the lagged refpop indicator by 8 years
    forvalues x = 1/8{
    gen refpop_`x' = refpop[_n-`x'] if child_birth_year - child_birth_year[_n-`x'] == `x'
    }
    
    * Result
    list, sep(0)
    Here is what the crosswalk file looks like:

    Code:
         +-----------------------------------------------------------------------------------------------------------+
         | child_~r   refpop   refpop_1   refpop_2   refpop_3   refpop_4   refpop_5   refpop_6   refpop_7   refpop_8 |
         |-----------------------------------------------------------------------------------------------------------|
      1. |     2003        0          .          .          .          .          .          .          .          . |
      2. |     2004        0          0          .          .          .          .          .          .          . |
      3. |     2005        0          0          0          .          .          .          .          .          . |
      4. |     2006        0          0          0          0          .          .          .          .          . |
      5. |     2007        0          0          0          0          0          .          .          .          . |
      6. |     2008        0          0          0          0          0          0          .          .          . |
      7. |     2009        0          0          0          0          0          0          0          .          . |
      8. |     2010        0          0          0          0          0          0          0          0          . |
      9. |     2011        0          0          0          0          0          0          0          0          0 |
     10. |     2012        .          0          0          0          0          0          0          0          0 |
     11. |     2013        1          .          0          0          0          0          0          0          0 |
     12. |     2014        1          1          .          0          0          0          0          0          0 |
     13. |     2015        1          1          1          .          0          0          0          0          0 |
         +-----------------------------------------------------------------------------------------------------------+
    Save a copy of this file somewhere (let's call that to_merge).

    Open the original data, and the merge them back using

    Code:
    merge m:1 child_birth_year using to_merge

    Comment


    • #3
      Dear Professor,

      You understood it correctly. Thank you so much. One thing that confuses my mind is that, each children have different values of refpop variable (after 2013) depending upon their city of residence and their birth year. Therefore, I am not sure whether this code takes this into account. I am a bit confused. Here is an example;
      Click image for larger version

Name:	statalist.png
Views:	1
Size:	281.8 KB
ID:	1665651


      Originally posted by Ken Chui View Post
      It's not very clear in the question, but I assumed that the refpop indicator is the same within each year (aka, if 2003 is 0, all 2003 is 0; if 2015 is 1, all 2015 is 1). If that's correct, I'd first create a year-level refpop data, and then create the lagged variables there, follow by a merge. Here are some sample codes, I created my own data set that spans across the range just to be sure:

      Code:
      clear
      input caseid child_birth_year refpop
      1 2003 0
      2 2004 0
      3 2005 0
      4 2006 0
      5 2007 0
      6 2008 0
      7 2009 0
      8 2010 0
      9 2011 0
      10 2013 1
      11 2014 1
      12 2015 1
      end
      
      * Collapse into annual summary
      collapse (mean) refpop, by(child_birth_year)
      
      * Add a missing row for year 2012
      set obs 13
      replace child_birth_year = 2012 in 13
      gsort child_birth_year
      
      * Create the lagged refpop indicator by 8 years
      forvalues x = 1/8{
      gen refpop_`x' = refpop[_n-`x'] if child_birth_year - child_birth_year[_n-`x'] == `x'
      }
      
      * Result
      list, sep(0)
      Here is what the crosswalk file looks like:

      Code:
      +-----------------------------------------------------------------------------------------------------------+
      | child_~r refpop refpop_1 refpop_2 refpop_3 refpop_4 refpop_5 refpop_6 refpop_7 refpop_8 |
      |-----------------------------------------------------------------------------------------------------------|
      1. | 2003 0 . . . . . . . . |
      2. | 2004 0 0 . . . . . . . |
      3. | 2005 0 0 0 . . . . . . |
      4. | 2006 0 0 0 0 . . . . . |
      5. | 2007 0 0 0 0 0 . . . . |
      6. | 2008 0 0 0 0 0 0 . . . |
      7. | 2009 0 0 0 0 0 0 0 . . |
      8. | 2010 0 0 0 0 0 0 0 0 . |
      9. | 2011 0 0 0 0 0 0 0 0 0 |
      10. | 2012 . 0 0 0 0 0 0 0 0 |
      11. | 2013 1 . 0 0 0 0 0 0 0 |
      12. | 2014 1 1 . 0 0 0 0 0 0 |
      13. | 2015 1 1 1 . 0 0 0 0 0 |
      +-----------------------------------------------------------------------------------------------------------+
      Save a copy of this file somewhere (let's call that to_merge).

      Open the original data, and the merge them back using

      Code:
      merge m:1 child_birth_year using to_merge
      Dea

      Comment


      • #4
        Originally posted by Cansu Oymak View Post
        Dear Professor,

        You understood it correctly. Thank you so much. One thing that confuses my mind is that, each children have different values of refpop variable (after 2013) depending upon their city of residence and their birth year. Therefore, I am not sure whether this code takes this into account. I am a bit confused.

        Dea
        Then it'd be appreciated if this important feature had been brought up at the beginning. That would have saved some time coming up with a useless solution. Since I don't know the data completion status for each plate (city), here is a relative more flexible way that does not rely on row numbers:

        Code:
        clear
        input child_birth_year refpop plate
        2003 0   1
        2004 0   1
        2005 0   1
        2006 0   1
        2007 0   1
        2008 0   1
        2009 0   1
        2010 0   1
        2011 0   1
        2013 0.2 1
        2014 0.4 1
        2015 0.6 1
        2003 0   2
        2004 0   2
        2005 0   2
        2006 0   2
        2007 0   2
        2008 0   2
        2009 0   2
        2010 0   2
        2011 0   2
        2013 0.5 2
        2014 0.7 2
        2015 0.9 2
        end
        
        * Collapse into annual summary
        collapse (mean) refpop, by(child_birth_year plate)
        save crosswalk, replace
        
        forvalue x = 1/8{
            use crosswalk, clear
            drop if child_birth_year == 2016 - `x'
            replace child_birth_year = child_birth_year + `x'
            drop if refpop == .
            drop if !inrange(child_birth_year, 2003, 2015)
            rename refpop refpop_`x'
            save t_`x', replace
            
            use crosswalk, clear
            merge m:1 child_birth_year plate using t_`x'
            drop _merge
            capture erase t_`x'
            save crosswalk, replace
        }
        
        gsort plate child_birth_year
        
        * Result
        list, sep(0)
        Results:
        Code:
             +-------------------------------------------------------------------------------------------------------------------+
             | child_~r   plate   refpop   refpop_1   refpop_2   refpop_3   refpop_4   refpop_5   refpop_6   refpop_7   refpop_8 |
             |-------------------------------------------------------------------------------------------------------------------|
          1. |     2003       1        0          .          .          .          .          .          .          .          . |
          2. |     2004       1        0          0          .          .          .          .          .          .          . |
          3. |     2005       1        0          0          0          .          .          .          .          .          . |
          4. |     2006       1        0          0          0          0          .          .          .          .          . |
          5. |     2007       1        0          0          0          0          0          .          .          .          . |
          6. |     2008       1        0          0          0          0          0          0          .          .          . |
          7. |     2009       1        0          0          0          0          0          0          0          .          . |
          8. |     2010       1        0          0          0          0          0          0          0          0          . |
          9. |     2011       1        0          0          0          0          0          0          0          0          0 |
         10. |     2012       1        .          0          0          0          0          0          0          0          0 |
         11. |     2013       1       .2          .          0          0          0          0          0          0          0 |
         12. |     2014       1       .4         .2          .          0          0          0          0          0          0 |
         13. |     2015       1       .6         .4         .2          .          0          0          0          0          0 |
         14. |     2003       2        0          .          .          .          .          .          .          .          . |
         15. |     2004       2        0          0          .          .          .          .          .          .          . |
         16. |     2005       2        0          0          0          .          .          .          .          .          . |
         17. |     2006       2        0          0          0          0          .          .          .          .          . |
         18. |     2007       2        0          0          0          0          0          .          .          .          . |
         19. |     2008       2        0          0          0          0          0          0          .          .          . |
         20. |     2009       2        0          0          0          0          0          0          0          .          . |
         21. |     2010       2        0          0          0          0          0          0          0          0          . |
         22. |     2011       2        0          0          0          0          0          0          0          0          0 |
         23. |     2012       2        .          0          0          0          0          0          0          0          0 |
         24. |     2013       2       .5          .          0          0          0          0          0          0          0 |
         25. |     2014       2       .7         .5          .          0          0          0          0          0          0 |
         26. |     2015       2       .9         .7         .5          .          0          0          0          0          0 |
             +-------------------------------------------------------------------------------------------------------------------+

        Comment


        • #5
          Dear Professor,

          I apologize for the misunderstanding at first.
          I have tried and it worked. Thank you very much for your time and effort. I will use these forward values to test the pre-existing trends. Thank you!
          Originally posted by Ken Chui View Post

          Then it'd be appreciated if this important feature had been brought up at the beginning. That would have saved some time coming up with a useless solution. Since I don't know the data completion status for each plate (city), here is a relative more flexible way that does not rely on row numbers:

          Code:
          clear
          input child_birth_year refpop plate
          2003 0 1
          2004 0 1
          2005 0 1
          2006 0 1
          2007 0 1
          2008 0 1
          2009 0 1
          2010 0 1
          2011 0 1
          2013 0.2 1
          2014 0.4 1
          2015 0.6 1
          2003 0 2
          2004 0 2
          2005 0 2
          2006 0 2
          2007 0 2
          2008 0 2
          2009 0 2
          2010 0 2
          2011 0 2
          2013 0.5 2
          2014 0.7 2
          2015 0.9 2
          end
          
          * Collapse into annual summary
          collapse (mean) refpop, by(child_birth_year plate)
          save crosswalk, replace
          
          forvalue x = 1/8{
          use crosswalk, clear
          drop if child_birth_year == 2016 - `x'
          replace child_birth_year = child_birth_year + `x'
          drop if refpop == .
          drop if !inrange(child_birth_year, 2003, 2015)
          rename refpop refpop_`x'
          save t_`x', replace
          
          use crosswalk, clear
          merge m:1 child_birth_year plate using t_`x'
          drop _merge
          capture erase t_`x'
          save crosswalk, replace
          }
          
          gsort plate child_birth_year
          
          * Result
          list, sep(0)
          Results:
          Code:
          +-------------------------------------------------------------------------------------------------------------------+
          | child_~r plate refpop refpop_1 refpop_2 refpop_3 refpop_4 refpop_5 refpop_6 refpop_7 refpop_8 |
          |-------------------------------------------------------------------------------------------------------------------|
          1. | 2003 1 0 . . . . . . . . |
          2. | 2004 1 0 0 . . . . . . . |
          3. | 2005 1 0 0 0 . . . . . . |
          4. | 2006 1 0 0 0 0 . . . . . |
          5. | 2007 1 0 0 0 0 0 . . . . |
          6. | 2008 1 0 0 0 0 0 0 . . . |
          7. | 2009 1 0 0 0 0 0 0 0 . . |
          8. | 2010 1 0 0 0 0 0 0 0 0 . |
          9. | 2011 1 0 0 0 0 0 0 0 0 0 |
          10. | 2012 1 . 0 0 0 0 0 0 0 0 |
          11. | 2013 1 .2 . 0 0 0 0 0 0 0 |
          12. | 2014 1 .4 .2 . 0 0 0 0 0 0 |
          13. | 2015 1 .6 .4 .2 . 0 0 0 0 0 |
          14. | 2003 2 0 . . . . . . . . |
          15. | 2004 2 0 0 . . . . . . . |
          16. | 2005 2 0 0 0 . . . . . . |
          17. | 2006 2 0 0 0 0 . . . . . |
          18. | 2007 2 0 0 0 0 0 . . . . |
          19. | 2008 2 0 0 0 0 0 0 . . . |
          20. | 2009 2 0 0 0 0 0 0 0 . . |
          21. | 2010 2 0 0 0 0 0 0 0 0 . |
          22. | 2011 2 0 0 0 0 0 0 0 0 0 |
          23. | 2012 2 . 0 0 0 0 0 0 0 0 |
          24. | 2013 2 .5 . 0 0 0 0 0 0 0 |
          25. | 2014 2 .7 .5 . 0 0 0 0 0 0 |
          26. | 2015 2 .9 .7 .5 . 0 0 0 0 0 |
          +-------------------------------------------------------------------------------------------------------------------+
          Dear

          Comment


          • #6
            Originally posted by Ken Chui View Post

            Then it'd be appreciated if this important feature had been brought up at the beginning. That would have saved some time coming up with a useless solution. Since I don't know the data completion status for each plate (city), here is a relative more flexible way that does not rely on row numbers:

            Code:
            clear
            input child_birth_year refpop plate
            2003 0 1
            2004 0 1
            2005 0 1
            2006 0 1
            2007 0 1
            2008 0 1
            2009 0 1
            2010 0 1
            2011 0 1
            2013 0.2 1
            2014 0.4 1
            2015 0.6 1
            2003 0 2
            2004 0 2
            2005 0 2
            2006 0 2
            2007 0 2
            2008 0 2
            2009 0 2
            2010 0 2
            2011 0 2
            2013 0.5 2
            2014 0.7 2
            2015 0.9 2
            end
            
            * Collapse into annual summary
            collapse (mean) refpop, by(child_birth_year plate)
            save crosswalk, replace
            
            forvalue x = 1/8{
            use crosswalk, clear
            drop if child_birth_year == 2016 - `x'
            replace child_birth_year = child_birth_year + `x'
            drop if refpop == .
            drop if !inrange(child_birth_year, 2003, 2015)
            rename refpop refpop_`x'
            save t_`x', replace
            
            use crosswalk, clear
            merge m:1 child_birth_year plate using t_`x'
            drop _merge
            capture erase t_`x'
            save crosswalk, replace
            }
            
            gsort plate child_birth_year
            
            * Result
            list, sep(0)
            Results:
            Code:
            +-------------------------------------------------------------------------------------------------------------------+
            | child_~r plate refpop refpop_1 refpop_2 refpop_3 refpop_4 refpop_5 refpop_6 refpop_7 refpop_8 |
            |-------------------------------------------------------------------------------------------------------------------|
            1. | 2003 1 0 . . . . . . . . |
            2. | 2004 1 0 0 . . . . . . . |
            3. | 2005 1 0 0 0 . . . . . . |
            4. | 2006 1 0 0 0 0 . . . . . |
            5. | 2007 1 0 0 0 0 0 . . . . |
            6. | 2008 1 0 0 0 0 0 0 . . . |
            7. | 2009 1 0 0 0 0 0 0 0 . . |
            8. | 2010 1 0 0 0 0 0 0 0 0 . |
            9. | 2011 1 0 0 0 0 0 0 0 0 0 |
            10. | 2012 1 . 0 0 0 0 0 0 0 0 |
            11. | 2013 1 .2 . 0 0 0 0 0 0 0 |
            12. | 2014 1 .4 .2 . 0 0 0 0 0 0 |
            13. | 2015 1 .6 .4 .2 . 0 0 0 0 0 |
            14. | 2003 2 0 . . . . . . . . |
            15. | 2004 2 0 0 . . . . . . . |
            16. | 2005 2 0 0 0 . . . . . . |
            17. | 2006 2 0 0 0 0 . . . . . |
            18. | 2007 2 0 0 0 0 0 . . . . |
            19. | 2008 2 0 0 0 0 0 0 . . . |
            20. | 2009 2 0 0 0 0 0 0 0 . . |
            21. | 2010 2 0 0 0 0 0 0 0 0 . |
            22. | 2011 2 0 0 0 0 0 0 0 0 0 |
            23. | 2012 2 . 0 0 0 0 0 0 0 0 |
            24. | 2013 2 .5 . 0 0 0 0 0 0 0 |
            25. | 2014 2 .7 .5 . 0 0 0 0 0 0 |
            26. | 2015 2 .9 .7 .5 . 0 0 0 0 0 |
            +-------------------------------------------------------------------------------------------------------------------+
            Dear Professor,

            I have another question if you do not mind.

            Basically, I want to create another variable, let's say fake_treatment. For example, if refpop_1 is 0.1 for children residing in city 1 and born in 2013, I want fake_treatment to take value of 0.1 for all children born in 2003 and reside in city 1. Or, if refpop_1 is 0.2 for children residing in city 2 and born in 2014, I want this fake_treatment to take value of 0.2 for all children born in 2004 and reside in city 2. It goes like that. Precesily, I want 2013 refpop_1 values to be replaced with children born in 2003, 2014 refpop_1 values to be replaced with children born in 2004, 2015 refpop_1 values to be replaced with children born in 2005, 2016 refpop_1 values to be replaced with children born in 2006, 2017 refpop_1 values to be replaced with children born in 2007, and 2018 refpop_1 values to be replaced with children born in 2008 conditional on their city of residence.

            Indeed, there are 81 cities in my dataset (denoted by plate) and I try to create a fake treatment variable based on the forward values of my actual treatment variable. Since there are 81 cities, I could not figure out the exact code. If you have time and can help, I'd be so so happy. Thank you in advance.

            Comment


            • #7
              Basically, I want to create another variable, let's say fake_treatment. For example, if refpop_1 is 0.1 for children residing in city 1 and born in 2013, I want fake_treatment to take value of 0.1 for all children born in 2003 and reside in city 1. Or, if refpop_1 is 0.2 for children residing in city 2 and born in 2014, I want this fake_treatment to take value of 0.2 for all children born in 2004 and reside in city 2. It goes like that.
              Whatever created by the codes in #4 only created a "scheme". You'd need to save the file and the perform a merge to merge it back to the original data set:

              Code:
              * Open your actual data set
              merge m:1 child_birth_year plate using crosswalk
              And all these so-called "fake treatment" will be assigned to each child in your data as a new variable.

              Precesily, I want 2013 refpop_1 values to be replaced with children born in 2003, 2014 refpop_1 values to be replaced with children born in 2004, 2015 refpop_1 values to be replaced with children born in 2005, 2016 refpop_1 values to be replaced with children born in 2006, 2017 refpop_1 values to be replaced with children born in 2007, and 2018 refpop_1 values to be replaced with children born in 2008 conditional on their city of residence.
              I cannot understand what the above is talking about. First, it does not seem to agree with what was asked the sentence before; second, I cannot understand what is meant by "to be replaced with children". From the post #1, it seems the data stopped at 2015, I can't figure out how 2016 - 2018 got into this. I'd suggest i) try to provide a sample data on what is actually wanted and/or ii) restate the question with more precise wording.
              Last edited by Ken Chui; 21 May 2022, 19:48.

              Comment


              • #8
                Originally posted by Ken Chui View Post

                Whatever created by the codes in #4 only created a "scheme". You'd need to save the file and the perform a merge to merge it back to the original data set:

                Code:
                * Open your actual data set
                merge m:1 child_birth_year plate using crosswalk
                And all these so-called "fake treatment" will be assigned to each child in your data as a new variable.



                I cannot understand what the above is talking about. First, it does not seem to agree with what was asked the sentence before; second, I cannot understand what is meant by "to be replaced with children". From the post #1, it seems the data stopped at 2015, I can't figure out how 2016 - 2018 got into this. I'd suggest i) try to provide a sample data on what is actually wanted and/or ii) restate the question with more precise wording.
                Dear Professor,

                First of all, I'd like answer this: "From the post #1, it seems the data stopped at 2015, I can't figure out how 2016 - 2018 got into this"- When I use dataex command, it did not include the observations till the end. Because, the child birth year starts from 2003 and ends in 2018. Therefore, while I was using you code you provided me by simply changing the years:

                collapse (mean) refpop, by(child_birth_year plate)
                save crosswalk, replace

                forvalue x = 1/8{
                use crosswalk, clear
                drop if child_birth_year == 2019 - `x'
                replace child_birth_year = child_birth_year + `x'
                drop if refpop == .
                drop if !inrange(child_birth_year, 2003, 2018)
                rename refpop refpop_`x'
                save t_`x', replace

                use crosswalk, clear
                merge m:1 child_birth_year plate using t_`x'
                drop _merge
                capture erase t_`x'
                save crosswalk, replace
                }

                gsort plate child_birth_year

                Then I merge this dataset with my master dataset (individual level dataset) where I have all my dependent and independent variables. Below, there is an example. Even in this example, due to 100 obs limitation of dataex code, you are unable to see children born in 2018, for example. I try my best to explain my problem here.

                Indeed, I am exploring the effect of refugee inflow exposure on native children's health outcomes as measured by height-for-age z score (haz06). The dataset I use is the DHS. I appended 3 rounds of DHS and as you might be familiar it is a cross sectional dataset. In the appended dataset, the children are born between 2003 to 2018. To understand children's exposure status to the refugee shock, I mainly utilize their year of birth and city of residence at the time of survey. The treatment var (refpop) is caculated as follows: Refpop= Refufee Number(y,c)/Initial popluation(c). Refugee Numbery,c is the number of refugees in city c at the year of birth of children, y. Then, Initial Populationc stands for the population in city c in 2010 as the population data starts from 2010.The refugee shock started in 2011 and city-level refugee number in the host country is available from 2013. In, 2012 it's missing. So I cant have any repop for children born in 2012. The refpop takes value of zero for all children born between 2003 to 2011 as there were no refugees in the host country in these years. As you can predict, since the number of refugees in the country is mainly 0 for children born between 2003 to 2011, refpop variable takes the value of zero for these children. It starts to take its actual values starting from 2013. My aim is to create the forward values of the refpop variable. So, I did it with you code. Next, I need to use these forward values to make a Placebo test. My aim is to define forward values of the treatment variable (refpop_1, refpop_2...) for those who have never treated (this cohort is the children born between 2003 and 2011). This is why I call the variable that I want to create as a fake treatment variable because those never treated will have fake treatment values based on the forward values of the treatment variable. For example, we already know that a child born in 2003 is never treated so this child's actual refpop is 0. But if I can replace the forward value of refpop to this child, then I can use it as a treated one.

                I truly hope it is clear

                Code:
                * Example generated by -dataex-. To install: ssc install dataex
                clear
                input str15 CASEID float(child_birth_year plate) double(haz06 refpop refpop_1 refpop_2 refpop_3 refpop_4 refpop_5 refpop_6 refpop_7 refpop_8)
                "      190414  2" 2003  1                1.17 0 . . . . . . . .
                "      1903 2  2" 2003  1                 .78 0 . . . . . . . .
                "      300213  4" 2003  4                -.97 0 . . . . . . . .
                "      2701 5  2" 2003  5               -1.86 0 . . . . . . . .
                "      151110  2" 2003  6                 .89 0 . . . . . . . .
                "      2008 9  2" 2003  7                -.25 0 . . . . . . . .
                "      2006 9  2" 2003  7                   . 0 . . . . . . . .
                "      100619  2" 2003 16               -1.72 0 . . . . . . . .
                "      1014 2  4" 2003 16                 .25 0 . . . . . . . .
                "      2603 7  2" 2003 19                   . 0 . . . . . . . .
                "      310910  2" 2003 25               -1.34 0 . . . . . . . .
                "      130313  2" 2003 26               -1.69 0 . . . . . . . .
                "      3416 5  2" 2003 27                   . 0 . . . . . . . .
                "      341211  2" 2003 27               -2.14 0 . . . . . . . .
                "      201614  2" 2003 31                 .38 0 . . . . . . . .
                "      2123 7  2" 2003 31                   . 0 . . . . . . . .
                "      202825  2" 2003 33                 .85 0 . . . . . . . .
                "       13719  2" 2003 34               -2.05 0 . . . . . . . .
                "       102 3  2" 2003 34                   . 0 . . . . . . . .
                "       14621  2" 2003 34                -.63 0 . . . . . . . .
                "       137 6  2" 2003 34                -.31 0 . . . . . . . .
                "       13812  2" 2003 34                 .65 0 . . . . . . . .
                "       13018  2" 2003 34                 .02 0 . . . . . . . .
                "       60521  2" 2003 35                   . 0 . . . . . . . .
                "       50210  2" 2003 35                   . 0 . . . . . . . .
                "      250214  2" 2003 37                -.27 0 . . . . . . . .
                "      221022  2" 2003 38               -1.11 0 . . . . . . . .
                "      220523  2" 2003 38                -.25 0 . . . . . . . .
                "      1811 8  2" 2003 42               -1.81 0 . . . . . . . .
                "      1601 3  2" 2003 42                   . 0 . . . . . . . .
                "      3213 2  2" 2003 44                -.45 0 . . . . . . . .
                "       60819  2" 2003 45                2.52 0 . . . . . . . .
                "      203414  2" 2003 46                -.74 0 . . . . . . . .
                "      290813  9" 2003 52                 .23 0 . . . . . . . .
                "      3619 2  2" 2003 56               -1.74 0 . . . . . . . .
                "      331713  4" 2003 65                 .03 0 . . . . . . . .
                "      222417  2" 2003 68                   . 0 . . . . . . . .
                "      191013  2" 2004  1               -1.87 0 0 . . . . . . .
                "      1915 4  2" 2004  1                 -.2 0 0 . . . . . . .
                "      1911 8  2" 2004  1                -.11 0 0 . . . . . . .
                "      191625  2" 2004  1                 -.8 0 0 . . . . . . .
                "      2001 5  3" 2004  1                 .39 0 0 . . . . . . .
                "      2101 4  2" 2004  1               -1.71 0 0 . . . . . . .
                "      191118  4" 2004  1                -.36 0 0 . . . . . . .
                "      191010  2" 2004  1                 .32 0 0 . . . . . . .
                "      2103 8  2" 2004  1                1.84 0 0 . . . . . . .
                "      1909 1  2" 2004  1                   0 0 0 . . . . . . .
                "      210314  2" 2004  1                -.37 0 0 . . . . . . .
                "      191313  2" 2004  1                -.87 0 0 . . . . . . .
                "      1911 5  2" 2004  1               -2.14 0 0 . . . . . . .
                "      1901 6  2" 2004  1               -1.83 0 0 . . . . . . .
                "      191422  2" 2004  1                1.09 0 0 . . . . . . .
                "      2104 8  2" 2004  1                -.15 0 0 . . . . . . .
                "      191124  2" 2004  1               -1.29 0 0 . . . . . . .
                "      1911 6  2" 2004  1                -.15 0 0 . . . . . . .
                "      350210  2" 2004  2                -.22 0 . . . . . . . .
                "      3502 7  2" 2004  2                -.86 0 . . . . . . . .
                "      3602 7  2" 2004  2               -3.08 0 . . . . . . . .
                "      3501 2  2" 2004  2                   . 0 . . . . . . . .
                "      360412  2" 2004  2                 .27 0 . . . . . . . .
                "      360215  5" 2004  2               -1.79 0 . . . . . . . .
                "       801 3  2" 2004  3                 .44 0 . . . . . . . .
                "      300424  4" 2004  4                -.31 0 0 . . . . . . .
                "      300418  2" 2004  4                   . 0 0 . . . . . . .
                "      300322  4" 2004  4               -5.33 0 0 . . . . . . .
                "      300316  2" 2004  4               -1.43 0 0 . . . . . . .
                "      3103 5  4" 2004  4               -3.12 0 0 . . . . . . .
                "      3005 8  2" 2004  4               -2.44 0 0 . . . . . . .
                "      300321  2" 2004  4               -1.23 0 0 . . . . . . .
                "      300210  5" 2004  4                -.72 0 0 . . . . . . .
                "      3101 2  3" 2004  4                 -.6 0 0 . . . . . . .
                "      300517  6" 2004  4                -2.2 0 0 . . . . . . .
                "      300517  4" 2004  4               -5.97 0 0 . . . . . . .
                "      3102 3  3" 2004  4                   . 0 0 . . . . . . .
                "      300320  1" 2004  4               -2.47 0 0 . . . . . . .
                "      3005 2  2" 2004  4               -3.72 0 0 . . . . . . .
                "      3003 3  2" 2004  4               -3.63 0 0 . . . . . . .
                "      300313  2" 2004  4               -1.28 0 0 . . . . . . .
                "      2701 4  2" 2004  5                -.15 0 0 . . . . . . .
                "      151524  2" 2004  6                1.42 0 0 . . . . . . .
                "      150213  2" 2004  6                -.15 0 0 . . . . . . .
                "      150310  2" 2004  6                   . 0 0 . . . . . . .
                "      151721  2" 2004  6                 .33 0 0 . . . . . . .
                "      151123  2" 2004  6               -1.68 0 0 . . . . . . .
                "      150816  2" 2004  6                -.49 0 0 . . . . . . .
                "      151923  2" 2004  6               -2.85 0 0 . . . . . . .
                "      1517 1  2" 2004  6                -.58 0 0 . . . . . . .
                "      1519 6  2" 2004  6   .9500000000000001 0 0 . . . . . . .
                "      1520 7  4" 2004  6                 .06 0 0 . . . . . . .
                "      151523  2" 2004  6                   . 0 0 . . . . . . .
                "      1512 8  2" 2004  6                 .15 0 0 . . . . . . .
                "      1510 6  2" 2004  6                   . 0 0 . . . . . . .
                "      200424  2" 2004  7                   . 0 0 . . . . . . .
                "      2003 6  2" 2004  7                -.66 0 0 . . . . . . .
                "      201211  2" 2004  7                -.98 0 0 . . . . . . .
                "      2111 6  2" 2004  7               -1.46 0 0 . . . . . . .
                "      200211  2" 2004  7 -.41000000000000003 0 0 . . . . . . .
                "      200513  2" 2004  7                 -.6 0 0 . . . . . . .
                "      2007 8  2" 2004  7                   . 0 0 . . . . . . .
                "      2003 2  2" 2004  7                 .31 0 0 . . . . . . .
                end


                Comment

                Working...
                X