  • How to create variables reflecting Differences in Means over different time periods in the panel dataset?

    Dear Colleagues,
    I am struggling to compute, in my panel dataset, variables that capture the difference in means between two time periods.
    The panel is identified by firm ID (ICO) and year (ROK) and contains continuous outcome variables. I need a new variable equal to the mean over the years 2013-2018 minus the mean over the years 2007-2012.
    I tried the following code, but it did not work: the difference in my mean outcome variable (ROA) comes out missing, because the two means are defined over different time windows. Any idea how to overcome this problem?
    Thanks a lot for your help and suggestions,

    Ondřej

    bys ICO: egen ROA_1318 = mean(ROA) if inrange(ROK,2013,2018)
    by ICO: egen ROA_1318 = mean(ROA) if ROK > 2012 & ROK < 2019
    by ICO: egen ROA_0812 = mean(ROA) if ROK > 2007 & ROK < 2013
    gen ROA_DIF = ROA_1318 - ROA_0812
    egen ROA_DIF = rowtotal(ROA_1318 ROA_0812_m)
    replace ROA_DIF = . if ROA_DIF==0

    bys ICO: egen ROA_1318 = mean(ROA) if inrange(ROK,2013,2018)
    bys ICO: egen ROA_0812 = mean(ROA) if inrange(ROK,2007,2012)
    by ICO: gen ROA_DIF = ROA_1318 - ROA_0812

    Example of my panel data structure (continuous variables omitted):

    ROK Sektor ICO NAZEV CONTINUOUS VARIABLES .....
    2003 2 205 Vojenské lesy a statky ČR, s.p.
    2004 2 205 Vojenské lesy a statky ČR, s.p.
    2005 2 205 Vojenské lesy a statky ČR, s.p.
    2006 2 205 Vojenské lesy a statky ČR, s.p.
    2007 2 205 Vojenské lesy a statky ČR, s.p.
    2008 2 205 Vojenské lesy a statky ČR, s.p.
    2009 2 205 Vojenské lesy a statky ČR, s.p.
    2010 2 205 Vojenské lesy a statky ČR, s.p.
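
    A minimal sketch of why the difference can come out missing, on made-up numbers (the ROA values and the second firm, ICO 306, are hypothetical; the earlier window follows the 2007-2012 wording of the text, and the name ROA_0712 is used here only for illustration). With `if`, `egen` fills in the mean only for observations inside the window, so no observation ever holds both means at once:

    ```stata
    * Toy panel: two firms, four years each (values are invented)
    clear
    input ICO ROK ROA
    205 2011 0.02
    205 2012 0.04
    205 2013 0.05
    205 2014 0.07
    306 2011 0.10
    306 2012 0.08
    306 2013 0.03
    306 2014 0.01
    end

    * Same pattern as in the post: the -if- restricts which rows receive a value
    bysort ICO: egen ROA_1318 = mean(ROA) if inrange(ROK, 2013, 2018)
    bysort ICO: egen ROA_0712 = mean(ROA) if inrange(ROK, 2007, 2012)

    * The windows do not overlap, so no row has both means non-missing
    count if !missing(ROA_1318) & !missing(ROA_0712)

    * Hence the difference is missing in every observation
    generate ROA_DIF = ROA_1318 - ROA_0712
    list ICO ROK ROA_1318 ROA_0712 ROA_DIF, sepby(ICO)
    ```

    The `count` should report 0 here, which is consistent with the empty ROA_DIF described above.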




  • #2
    Ondrej:
    do you mean something along the following lines?
    Code:
    . use https://www.stata-press.com/data/r17/nlswork.dta
    . bysort idcode (year): gen wanted_1= ln_wage[_n]-ln_wage[_n-1] if year<=77
    . bysort idcode (year): gen wanted_2= ln_wage[_n]-ln_wage[_n-1] if year>77 & year<.
    . list idcode year ln_wage wanted_1 wanted_2 in 1/20
    
         +--------------------------------------------------+
         | idcode   year    ln_wage    wanted_1    wanted_2 |
         |--------------------------------------------------|
      1. |      1     70   1.451214           .           . |
      2. |      1     71    1.02862   -.4225942           . |
      3. |      1     72   1.589977    .5613576           . |
      4. |      1     73   1.780273    .1902955           . |
      5. |      1     75   1.777012   -.0032612           . |
         |--------------------------------------------------|
      6. |      1     77   1.778681    .0016689           . |
      7. |      1     78   2.493976           .    .7152953 |
      8. |      1     80   2.551715           .    .0577395 |
      9. |      1     83   2.420261           .    -.131454 |
     10. |      1     85   2.614172           .    .1939111 |
         |--------------------------------------------------|
     11. |      1     87   2.536374           .   -.0777988 |
     12. |      1     88   2.462927           .   -.0734463 |
     13. |      2     71   1.360348           .           . |
     14. |      2     72   1.206198   -.1541507           . |
     15. |      2     73   1.549883    .3436854           . |
         |--------------------------------------------------|
     16. |      2     75   1.832581    .2826983           . |
     17. |      2     77   1.726721   -.1058601           . |
     18. |      2     78    1.68991           .    -.036811 |
     19. |      2     80   1.726964           .    .0370541 |
     20. |      2     82   1.808289           .    .0813245 |
         +--------------------------------------------------+
    
    .
    Kind regards,
    Carlo
    (Stata 19.0)



    • #3
      Dear Carlo,

      Thanks a lot. Yes, this is roughly as far as I can get on my own, but I need to go one step further:
      1) The Wanted_1 and Wanted_2 periods are fixed, and I need to end up with the mean values of Wanted_1 and Wanted_2.
      2) From Wanted_1 and Wanted_2 I need to calculate a third variable, Wanted_Difference = Wanted_2 - Wanted_1, which I will use later in my DID estimates.

      I would be grateful for your insights.




      • #4
        Ondrej:
        I do hope that what follows may be of some help:

        Code:
        . use https://www.stata-press.com/data/r17/nlswork.dta
        (National Longitudinal Survey of Young Women, 14-24 years old in 1968)
        
        . bysort idcode (year): egen wanted_1= mean(ln_wage) if year<=77
        
        . bysort idcode (year): egen wanted_2= mean(ln_wage) if year>77 & year<.
        
        . gen wanted_group=0 if wanted_1!=.
        
        . replace wanted_group=1 if wanted_2!=.
        
        . egen wanted_total=rowtotal( wanted_1 wanted_2)
        
        . ttest wanted_total,by( wanted_group )
        
        Two-sample t test with equal variances
        ------------------------------------------------------------------------------
           Group |     Obs        Mean    Std. err.   Std. dev.   [95% conf. interval]
        ---------+--------------------------------------------------------------------
               0 |  14,130    1.559541    .0028964    .3442965    1.553863    1.565218
               1 |  14,404    1.788079    .0036461    .4375968    1.780932    1.795226
        ---------+--------------------------------------------------------------------
        Combined |  28,534    1.674907    .0024295    .4103864    1.670145    1.679669
        ---------+--------------------------------------------------------------------
            diff |           -.2285381    .0046671               -.2376859   -.2193904
        ------------------------------------------------------------------------------
            diff = mean(0) - mean(1)                                      t = -48.9680
        H0: diff = 0                                     Degrees of freedom =    28532
        
            Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
         Pr(T < t) = 0.0000         Pr(|T| > |t|) = 0.0000          Pr(T > t) = 1.0000
        
        .
        Last edited by Carlo Lazzaro; 26 Oct 2022, 08:17.
        Kind regards,
        Carlo
        (Stata 19.0)



        • #5
          Carlo, thank you very much. This goes further, but I really need a single variable containing the difference, because I then use it in the kmatch command (i.e., propensity score matching). I tried the approach with the groups, but with the panel structure I could not end up with a single variable, even when trying several interactions.
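
          One possible way to obtain such a single variable, sketched under assumptions: the toy ROA values and the second firm (ICO 306) are invented, the windows follow the 2007-2012 and 2013-2018 periods named in the first post, and ROA_0712 is an illustrative name. The idea is to move the window condition inside `mean()` with `cond()`, so that `egen` writes each firm's window mean to all of that firm's observations and the two means can be subtracted row by row:

          ```stata
          * Toy panel: two firms, four years each (values are invented)
          clear
          input ICO ROK ROA
          205 2011 0.02
          205 2012 0.04
          205 2013 0.05
          205 2014 0.07
          306 2011 0.10
          306 2012 0.08
          306 2013 0.03
          306 2014 0.01
          end

          * cond() returns ROA inside the window and missing outside it;
          * mean() ignores the missings, and with no -if- qualifier the
          * result is stored in every observation of the firm
          bysort ICO: egen ROA_1318 = mean(cond(inrange(ROK, 2013, 2018), ROA, .))
          bysort ICO: egen ROA_0712 = mean(cond(inrange(ROK, 2007, 2012), ROA, .))

          * Later-period mean minus earlier-period mean, constant within firm
          generate ROA_DIF = ROA_1318 - ROA_0712

          list ICO ROK ROA ROA_1318 ROA_0712 ROA_DIF, sepby(ICO)
          ```

          On these numbers ROA_DIF should be approximately 0.03 for firm 205 and -0.07 for firm 306 in every year. A firm with no observations in one of the windows would still get a missing difference, which may or may not be what is wanted before matching.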
