
  • xthdidregress event studies: basetime(common) feature

    Hi everyone,
    I've been using xthdidregress for some staggered difference-in-differences projects in order to take advantage of the Callaway and Sant'Anna estimator. However, as has been noted previously, there have been issues with the event studies produced by the command, specifically that the pretreatment ATT estimates are not fixed to t-1 (the period before treatment), as is standard in event study presentations. I've played around with csdid and understand I can use the long2 option in that command to produce what I'm looking for, but I am curious to compare it with the native Stata command. The data I'm using are in a restricted environment, so I will use the sample data to demonstrate:

    Code:
    use https://www.stata-press.com/data/r18/akc
    xtset breed year
    quietly xthdidregress ra (registered) (movie), group(breed) 
    estat aggregation, dynamic
    
    . use https://www.stata-press.com/data/r18/akc, clear
    (Fictional dog breed and AKC registration data)
    
    . xtset breed year
    
    Panel variable: breed (strongly balanced)
     Time variable: year, 2031 to 2040
             Delta: 1 unit
    
    . quietly xthdidregress ra (registered) (movie), group(breed) 
    
    . estat aggregation, dynamic
    
    Duration of exposure ATET                                Number of obs = 1,410
    
                                    (Std. err. adjusted for 141 clusters in breed)
    ------------------------------------------------------------------------------
                 |               Robust
        Exposure |       ATET   std. err.      z    P>|z|     [95% conf. interval]
    -------------+----------------------------------------------------------------
              -5 |  -91.45434    159.468    -0.57   0.566    -404.0058    221.0971
              -4 |  -92.56303   153.2523    -0.60   0.546     -392.932     207.806
              -3 |  -10.44304   152.6681    -0.07   0.945     -309.667    288.7809
              -2 |   17.20474   142.2066     0.12   0.904    -261.5151    295.9245
              -1 |   32.04316   130.2186     0.25   0.806    -223.1806     287.267
               0 |   1441.897   205.4639     7.02   0.000     1039.195    1844.599
               1 |   1749.923    221.871     7.89   0.000     1315.064    2184.782
               2 |   2167.487   218.3947     9.92   0.000     1739.441    2595.533
               3 |   2653.018   284.8181     9.31   0.000     2094.785    3211.252
               4 |    2372.01   274.5426     8.64   0.000     1833.916    2910.103
               5 |   2663.019   528.9573     5.03   0.000     1626.282    3699.756
               6 |   3087.811   587.8304     5.25   0.000     1935.685    4239.937
    ------------------------------------------------------------------------------
    So in the standard command, pre-treatment ATTs are not normed to t-1. In the past, I had used a user-written command called eventbaseline to normalize these, which would produce output like the following:

    Code:
    . quietly xthdidregress ra (registered) (movie), group(breed) 
    
    . quietly estat aggregation, dynamic
    
    . eventbaseline, pre(5) post(5) baseline(-1)
    
    Event study relative to -1               Number of obs = 1,391
    
    ------------------------------------------------------------------------------
      registered |       ATET   Std. err.      z    P>|z|     [95% conf. interval]
    -------------+----------------------------------------------------------------
              -5 |   53.75817   147.2466     0.37   0.715    -234.8398    342.3561
              -4 |  -38.80486   165.1085    -0.24   0.814    -362.4115    284.8018
              -3 |   -49.2479   176.8549    -0.28   0.781     -395.877    297.3812
              -2 |  -32.04316   130.2186    -0.25   0.806     -287.267    223.1806
              -1 |          0  (omitted)
               0 |   1441.897   205.4639     7.02   0.000     1039.195    1844.599
               1 |   1749.923    221.871     7.89   0.000     1315.064    2184.782
               2 |   2167.487   218.3947     9.92   0.000     1739.441    2595.533
               3 |   2653.018   284.8181     9.31   0.000     2094.785    3211.252
               4 |    2372.01   274.5426     8.64   0.000     1833.916    2910.103
               5 |   2663.019   528.9573     5.03   0.000     1626.282    3699.756
    ------------------------------------------------------------------------------
    I'm not sure how reliable this command was, so I was excited to see in another post there was an option called basetime(common) that was just added which should fix this and norm the pre-period ATTs to t-1.

    Code:
    quietly xthdidregress ra (registered) (movie), group(breed) basetime(common)
    
    . estat aggregation, dynamic
    
    Duration of exposure ATET                                Number of obs = 1,410
    
                                    (Std. err. adjusted for 141 clusters in breed)
    ------------------------------------------------------------------------------
                 |               Robust
        Exposure |       ATET   std. err.      z    P>|z|     [95% conf. interval]
    -------------+----------------------------------------------------------------
              -6 |   79.31597   198.9669     0.40   0.690    -310.6521     469.284
              -5 |  -80.20635   143.2517    -0.56   0.576    -360.9746    200.5619
              -4 |  -172.7694    169.218    -1.02   0.307    -504.4305    158.8917
              -3 |   -49.2479   176.8549    -0.28   0.781     -395.877    297.3812
              -2 |  -32.04316   130.2186    -0.25   0.806     -287.267    223.1806
               0 |   1441.897   205.4639     7.02   0.000     1039.195    1844.599
               1 |   1749.923    221.871     7.89   0.000     1315.064    2184.782
               2 |   2167.487   218.3947     9.92   0.000     1739.441    2595.533
               3 |   2653.018   284.8181     9.31   0.000     2094.785    3211.252
               4 |    2372.01   274.5426     8.64   0.000     1833.916    2910.103
               5 |   2663.019   528.9573     5.03   0.000     1626.282    3699.756
               6 |   3087.811   587.8304     5.25   0.000     1935.685    4239.937
    ------------------------------------------------------------------------------
    Note: Base time for pretreatment ATETs is the last pretreatment period.
    Note: Exposure is the number of periods since the first treatment time.
    I ran it, and as we can see t-1 is now omitted. However, the pre-treatment ATTs do not line up with eventbaseline (above), and in addition I'm confused as to why there is now an exposure -6 time period when in the original dynamic results (first code example) there are no estimates for anything before t-5. Has anyone run into this when using the command?

    In addition, how confident should we feel that this is similar enough to the long2 option in csdid/csdid2? My original data showed some discrepancies there as well, and I can try to work up some sample data here to demonstrate if that's the case.

    Thank you!


  • #2
    I would like to second this question. I have encountered a similar issue and have questions regarding "estat aggregation" following "xthdidregress" with adaptive and common basetime versus traditional event-study plots.

    One observation is that the coefficient and SE at t=-2 from the dynamic aggregation with common basetime are the same as the coefficient and SE at t=-2 in the "eventbaseline" result. Additionally, they match the coefficient (with opposite sign) and SE at t=-1 from the dynamic aggregation with adaptive basetime.
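
    The pattern seems to go further than t=-2. Using only the numbers printed above, each pre-period "eventbaseline" row equals minus the running sum of the adaptive-basetime rows from t=-1 back to the period just after it. A quick check with display (the scalar names a1 to a4 are just labels I made up; the values are copied from the first table):

    ```stata
    * Adaptive-basetime ATETs at exposures -1 to -4, copied from the first table
    scalar a1 =  32.04316
    scalar a2 =  17.20474
    scalar a3 = -10.44304
    scalar a4 = -92.56303

    * Minus the running sum, to compare against the eventbaseline rows
    display -(a1)                    // compare with eventbaseline row -2
    display -(a1 + a2)               // compare with eventbaseline row -3
    display -(a1 + a2 + a3)          // compare with eventbaseline row -4
    display -(a1 + a2 + a3 + a4)     // compare with eventbaseline row -5
    ```

    If that reading is right, "eventbaseline" rebases the already-aggregated estimates, whereas basetime(common) changes the base period before aggregating. That could explain why the two agree at t=-2 and t=-3 but disagree from t=-4 back (for example, -38.80486 versus -172.7694 at t=-4). This is an inference from the output, not something I have confirmed in the documentation.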

    I am curious whether some of the experts have any thoughts: Jeff Wooldridge, FernandoRios



    • #3
      Hello Andy and Tim,

      With regard to differences between -csdid- and -xthdidregress-, in your example there are no differences. I would need to look at your specific data to identify the differences between the two commands. Below is the code I used followed by the output:

      Code:
      cls
      clear 
      webuse akc
      xtset breed year 
      quietly xthdidregress ra (registered) (movie), group(breed) basetime(common)
      estat aggregation, dynamic
      quietly csdid registered, ivar(breed) time(year) gvar(_did_cohort ) ///
          long2
      estat event
      This yields the following output:

      Code:
      . clear 
      
      . webuse akc
      (Fictional dog breed and AKC registration data)
      
      . xtset breed year 
      
      Panel variable: breed (strongly balanced)
       Time variable: year, 2031 to 2040
               Delta: 1 unit
      
      . quietly xthdidregress ra (registered) (movie), group(breed) basetime(common)
      
      . estat aggregation, dynamic
      
      Duration of exposure ATET                                Number of obs = 1,410
      
                                      (Std. err. adjusted for 141 clusters in breed)
      ------------------------------------------------------------------------------
                   |               Robust
          Exposure |       ATET   std. err.      z    P>|z|     [95% conf. interval]
      -------------+----------------------------------------------------------------
                -6 |   79.31597   198.9669     0.40   0.690    -310.6521     469.284
                -5 |  -80.20635   143.2517    -0.56   0.576    -360.9746    200.5619
                -4 |  -172.7694    169.218    -1.02   0.307    -504.4305    158.8917
                -3 |   -49.2479   176.8549    -0.28   0.781     -395.877    297.3812
                -2 |  -32.04316   130.2186    -0.25   0.806     -287.267    223.1806
                 0 |   1441.897   205.4639     7.02   0.000     1039.195    1844.599
                 1 |   1749.923    221.871     7.89   0.000     1315.064    2184.782
                 2 |   2167.487   218.3947     9.92   0.000     1739.441    2595.533
                 3 |   2653.018   284.8181     9.31   0.000     2094.785    3211.252
                 4 |    2372.01   274.5426     8.64   0.000     1833.916    2910.103
                 5 |   2663.019   528.9573     5.03   0.000     1626.282    3699.756
                 6 |   3087.811   587.8304     5.25   0.000     1935.685    4239.937
      ------------------------------------------------------------------------------
      Note: Base time for pretreatment ATETs is the last pretreatment period.
      Note: Exposure is the number of periods since the first treatment time.
      
      . quietly csdid registered, ivar(breed) time(year) gvar(_did_cohort ) ///
      >         long2
      
      . estat event
      ATT by Periods Before and After treatment
      Event Study:Dynamic effects
      ------------------------------------------------------------------------------
                   | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
      -------------+----------------------------------------------------------------
           Pre_avg |  -50.99016   134.9794    -0.38   0.706    -315.5449    213.5646
          Post_avg |   2305.024   165.4701    13.93   0.000     1980.708    2629.339
               Tm6 |   79.31597   198.9669     0.40   0.690    -310.6521     469.284
               Tm5 |  -80.20635   143.2517    -0.56   0.576    -360.9746    200.5619
               Tm4 |  -172.7694    169.218    -1.02   0.307    -504.4305    158.8917
               Tm3 |   -49.2479   176.8549    -0.28   0.781     -395.877    297.3812
               Tm2 |  -32.04316   130.2186    -0.25   0.806     -287.267    223.1806
               Tp0 |   1441.897   205.4639     7.02   0.000     1039.195    1844.599
               Tp1 |   1749.923    221.871     7.89   0.000     1315.064    2184.782
               Tp2 |   2167.487   218.3947     9.92   0.000     1739.441    2595.533
               Tp3 |   2653.018   284.8181     9.31   0.000     2094.785    3211.252
               Tp4 |    2372.01   274.5426     8.64   0.000     1833.916    2910.103
               Tp5 |   2663.019   528.9573     5.03   0.000     1626.282    3699.756
               Tp6 |   3087.811   587.8304     5.25   0.000     1935.685    4239.937
      ------------------------------------------------------------------------------
      That being said, one possible source of differences is that -csdid- asks you to supply a cohort variable, whereas -xthdidregress- constructs it for you. With gaps in your data, this might create differences. We have added a new option, -usercohort()-, that allows you to provide your own cohort variable, which circumvents this issue. Additionally, we have a new command, -gencohort-, that will generate that cohort variable for you and check that the assumptions needed are met.



      • #4
        I don't have a lot to add, but I can confirm that, generally, the following commands give the same answers in the balanced case, without gaps in time, and with no controls (or time-constant controls). The variable id is the cross-sectional identifier and I'm assuming t is the time identifier numbered consecutively from start to finish -- such as year without gaps. The variable cohort is zero for a never treated unit and equal to the first time period of treatment for a treated unit. The variable w is the time-varying treatment indicator (that goes from zero to one for a treated unit):

        Code:
        xtset id t
        xthdidregress ra (y) (w), group(id) basetime(common)
        Code:
        jwdid y, ivar(id) tvar(t) gvar(cohort) never
        csdid y, ivar(id) time(t) gvar(cohort) long2


        All produce the traditional event study plot (weighted by cohort shares). Other variations do not. If you use jwdid and drop the never option then the estimator is "lags only" and the actual treatment effects will change.
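
        In case it helps, here is a minimal sketch of how the cohort variable described above could be built from w. The toy data and the helper name first_treat are hypothetical and only there to make the snippet runnable; with real data you would skip the input block.

        ```stata
        * Toy panel (hypothetical): unit 1 is first treated at t = 3, unit 2 never
        clear
        input id t w
        1 1 0
        1 2 0
        1 3 1
        1 4 1
        2 1 0
        2 2 0
        2 3 0
        2 4 0
        end

        * First treated period per unit; missing if the unit is never treated
        bysort id: egen first_treat = min(cond(w == 1, t, .))

        * cohort = 0 for never-treated units, first treatment period otherwise
        generate cohort = cond(missing(first_treat), 0, first_treat)

        tabulate cohort
        ```

        This assumes treatment is absorbing (w never switches back to zero), as in the setup described above.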

