  • Small Sample Inference for DID

    I've been curious as of late about how to estimate SEs for small samples in DID. Say we estimate
    Code:
    clear *
    net from "https://raw.githubusercontent.com/jgreathouse9/FDIDTutorial/main"
    net install fdid, replace
    u basque, clear
    fdid gdpcap, tr(treat) gr2opts(scheme(sj))
    
    mkf newframe
    
    cwf newframe
    cls
    svmat e(series), names(col)
    g time = _n
    su time if eventt==0
    
    loc lastneg = r(mean)-1
    bootstrap, nodrop: reg te5 ib(`lastneg').time, nocons
    My goal in the code above is to estimate bootstrap standard errors for each individual treatment effect. Yet this is what is returned:
    Code:
    . bootstrap, nodrop: reg te5 ib(`lastneg').time, nocons
    (running regress on estimation sample)
    
    Bootstrap replications (50)
    ----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5 
    xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx    50
    insufficient observations to compute bootstrap standard errors
    no results will be saved
    r(2000);
    
    end of do-file
    
    r(2000);
    Naturally, I'd expect the bootstrap SEs to be returned. If I get rid of the bootstrap prefix, we get

    Code:
    . reg te5 ib(`lastneg').time, nocons
    
          Source |       SS           df       MS      Number of obs   =        43
    -------------+----------------------------------   F(42, 1)        =    197.06
           Model |  20.7657675        42  .494423035   Prob > F        =    0.0565
        Residual |  .002508966         1  .002508966   R-squared       =    0.9999
    -------------+----------------------------------   Adj R-squared   =    0.9948
           Total |  20.7682764        43  .482983173   Root MSE        =    .05009
    
    ------------------------------------------------------------------------------
             te5 | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
    -------------+----------------------------------------------------------------
            time |
              1  |   .0952546   .0500896     1.90   0.308    -.5411939    .7317031
              2  |   .0376283   .0500896     0.75   0.590    -.5988202    .6740768
              3  |  -.0217828   .0500896    -0.43   0.739    -.6582313    .6146656
              4  |  -.0741609   .0500896    -1.48   0.378    -.7106094    .5622876
              5  |  -.1258603   .0500896    -2.51   0.241    -.7623088    .5105882
              6  |  -.1159347   .0500896    -2.31   0.260    -.7523832    .5205138
              7  |   -.103331   .0500896    -2.06   0.287    -.7397795    .5331174
              8  |  -.0398849   .0500896    -0.80   0.572    -.6763334    .5965636
              9  |   .0090296   .0500896     0.18   0.886    -.6274189    .6454781
             10  |   .0795808   .0500896     1.59   0.358    -.5568677    .7160293
             11  |   .1404565   .0500896     2.80   0.218     -.495992     .776905
             12  |   .0977903   .0500896     1.95   0.301    -.5386582    .7342388
             13  |   .0518745   .0500896     1.04   0.489    -.5845739     .688323
             14  |   .0529453   .0500896     1.06   0.482    -.5835032    .6893938
             15  |   .0453055   .0500896     0.90   0.532     -.591143     .681754
             16  |  -.0016811   .0500896    -0.03   0.979    -.6381296    .6347674
             17  |  -.0322798   .0500896    -0.64   0.636    -.6687283    .6041687
             18  |  -.0548448   .0500896    -1.09   0.471    -.6912933    .5816037
             19  |  -.0901919   .0500896    -1.80   0.323    -.7266404    .5462566
             21  |   .1747319   .0500896     3.49   0.178    -.4617166    .8111804
             22  |  -.0432762   .0500896    -0.86   0.546    -.6797247    .5931723
             23  |  -.2550735   .0500896    -5.09   0.123     -.891522     .381375
             24  |  -.5257106   .0500896   -10.50   0.060    -1.162159    .1107379
             25  |  -.6823086   .0500896   -13.62   0.047    -1.318757   -.0458601
             26  |  -.8041667   .0500896   -16.05   0.040    -1.440615   -.1677182
             27  |  -.9361285   .0500896   -18.69   0.034    -1.572577     -.29968
             28  |  -1.005573   .0500896   -20.08   0.032    -1.642022   -.3691249
             29  |  -1.074268   .0500896   -21.45   0.030    -1.710716   -.4378194
             30  |  -1.007323   .0500896   -20.11   0.032    -1.643771   -.3708741
             31  |  -.9358075   .0500896   -18.68   0.034    -1.572256    -.299359
             32  |  -1.010143   .0500896   -20.17   0.032    -1.646591   -.3736942
             33  |  -1.068733   .0500896   -21.34   0.030    -1.705182    -.432285
             34  |  -1.149782   .0500896   -22.95   0.028    -1.786231   -.5133338
             35  |  -1.214765   .0500896   -24.25   0.026    -1.851213   -.5783162
             36  |   -1.18513   .0500896   -23.66   0.027    -1.821578    -.548681
             37  |  -1.174418   .0500896   -23.45   0.027    -1.810866   -.5379694
             38  |   -1.11872   .0500896   -22.33   0.028    -1.755169   -.4822716
             39  |  -1.063022   .0500896   -21.22   0.030     -1.69947   -.4265732
             40  |  -1.112293   .0500896   -22.21   0.029    -1.748742   -.4758449
             41  |  -.9926846   .0500896   -19.82   0.032    -1.629133   -.3562361
             42  |  -.9901853   .0500896   -19.77   0.032    -1.626634   -.3537368
             43  |  -.9516248   .0500896   -19.00   0.033    -1.588073   -.3151763
    ------------------------------------------------------------------------------

    We can, however, redo this exact same estimation using xtreg. It is the same model, with a little more legwork:
    Code:
    clear *
    
    u basque, clear
    
    tempvar cohort
    
    bys id: egen `cohort' = min(year) if treat==1
    
    bys id: egen cohort = max(`cohort')
    
    g event = year-cohort
    
    bys id: g time = _n
    
    replace time = 0 if missing(cohort)
    
    summ time
    g shifted_ttt = time - r(min)
    summ shifted_ttt if event == 0
    local true_neg1 = r(mean)-1
    cls
    * Regress on the event-time dummies with unit and year FEs,
    * bootstrapping the VCE
    * use ib# to specify the reference period
    xtreg gdpcap ib(`true_neg1').shifted_ttt i.year if inlist(id,2,5,10), fe vce(bootstrap)
    
    
    * Pull out the coefficients and SEs
    g coef = .
    g se = .
    levelsof shifted_ttt, l(times)
    foreach t in `times' {
        replace coef = _b[`t'.shifted_ttt] if shifted_ttt == `t'
        replace se = _se[`t'.shifted_ttt] if shifted_ttt == `t'
    }
    
    * Make confidence intervals
    g ci_top = coef+1.96*se
    g ci_bottom = coef - 1.96*se
    
    * Limit ourselves to one observation per event time
    * (event holds the original timing relative to treatment)
    keep event coef se ci_*
    duplicates drop
    
    sort event
    
    * Create connected scatterplot of coefficients
    * with CIs included with rcap
    * and a line at 0 both horizontally and vertically
    summ ci_top
    local top_range = r(max)
    summ ci_bottom
    local bottom_range = r(min)
    
    twoway (sc coef event, connect(line)) ///
        (rcap ci_top ci_bottom event), ///
        xtitle("Time to Treatment") caption("95% Confidence Intervals Shown") ///
        scheme(sj) xli(-1, lpat(dash)) yli(0)
    My question, then, is: how can I estimate bootstrapped SEs for the first block of code I provided? That is, I need an SE for every period except the one right before treatment begins. Do I need to write a custom bootstrap program to do this, or what other options might I have?
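    For what it's worth, one custom scheme I could imagine is a residual bootstrap: hold the time dummies fixed and resample only the residuals, so no ib().time level ever goes empty (which is presumably why r(2000) appears above — each replicate that -bootstrap- draws omits some periods, and with one observation per time dummy the saturated regression cannot be refit). This is only a hedged sketch, not fdid's methodology: pick, y_star, and the 500 replications are illustrative, it assumes te5, time, and the local lastneg exist as in the first code block, and with only one residual degree of freedom here its statistical usefulness is debatable.
    Code:
    * Residual bootstrap sketch around the saturated event-time regression
    set seed 1000
    qui reg te5 ib(`lastneg').time, nocons
    predict double xb_hat, xb
    predict double e_hat, resid
    
    tempname sim
    tempfile draws
    postfile `sim' double b1 using `draws', replace
    
    forvalues r = 1/500 {
        * draw one residual per row, with replacement
        qui g long pick = ceil(runiform()*_N)
        qui g double y_star = xb_hat + e_hat[pick]
        qui reg y_star ib(`lastneg').time, nocons
        post `sim' (_b[1.time])   // stores period 1's effect only;
                                  // loop over all periods in practice
        drop pick y_star
    }
    postclose `sim'
    
    preserve
    use `draws', clear
    su b1    // the SD of b1 approximates the bootstrap SE for period 1
    restore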

  • #2
    Look at fect (starting at line 580). It appears to use bsample and then re-estimate the model repeatedly (no Mata). With fdid, that might take some time.
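    Roughly, that loop might look like the following — a hedged sketch, not fect's actual code: the 200 replications, the id offset, the xtset call, and e(att) as fdid's stored ATT are all assumptions to check against -ereturn list-.
    Code:
    * Resample donor units with replacement, keep the treated unit,
    * re-run fdid each time, and collect the ATT across replications
    u basque, clear
    bys id: egen byte anytr = max(treat)   // flag the treated unit's rows
    
    preserve
    keep if anytr==1
    tempfile trdata
    save `trdata'
    restore
    keep if anytr==0
    tempfile donors
    save `donors'
    
    tempname sim
    tempfile atts
    postfile `sim' double att using `atts', replace
    
    forvalues r = 1/200 {
        use `donors', clear
        bsample, cluster(id) idcluster(bid)   // resample whole donor units
        replace id = 1000 + bid               // fresh ids for duplicated donors
        drop bid
        append using `trdata'
        qui xtset id year                     // assuming fdid wants an xtset panel
        qui fdid gdpcap, tr(treat)
        post `sim' (e(att))                   // assumed stored result; check ereturn list
    }
    postclose `sim'
    
    use `atts', clear
    su att    // the SD across replications approximates the ATT's bootstrap SE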



    • #3
      Yeah George Ford, this was actually the idea I'd come up with a few days ago. I'm debating going the extra mile and just having fect handle the final estimation on the reduced donor pool. Of course, I'd need to redo a little of fdid's syntax under the hood so that it respects fect's data structure requirements, but I suspect this would be the most convenient way of doing things. After all, when I estimate
      Code:
      fect cigsale if inlist(id,3,4,5,19,21), ///
      treat(treated) unit(id) time(year) se nboots(500) ///
      vartype("bootstrap")
      the returned point estimate is -13.64671, exactly what fdid currently returns. At that point, one could likely use the e(ATTs) matrix to combine the N_1 treated units, calculate cohort ATTs and standard errors, and build event-study plots without much effort. So I'll consider it! The thing is, Kathy didn't do this in her original paper, and (for Stata Journal purposes anyway) I can see reviewers complaining that this method wasn't the one in the original paper, so it would need more validation (simulations, etc.). On the other hand, it would be the most straightforward way to approach this, I think.



      • #4
        That's one interesting thing about fdid: it basically chooses a control group and then proceeds in a fairly straightforward way. If I can find the time, I'll try to implement the fect-style bootstrap in fdid and send it along.
