CSDID and DID regressions produce different standard errors

Alessandro Bafaro

Join Date: Aug 2023
Posts: 3

04 Nov 2023, 11:45

Hello. I am estimating a staggered DID model (Callaway and Sant'Anna 2021) with the csdid package, and I am comparing its results to several DID regressions.

I don't understand why DID regressions and the csdid command produce different standard errors.

Here are the results of the csdid command:

Code:

. encode comune, gen(idcomune)

. csdid vote_share, ivar(idcomune) time(period) gvar(first_treated) reg
Units always treated found. These will be ignored
Panel is not balanced
Will use observations with Pair balanced (observed at t0 and t1)
................
Difference-in-difference with Multiple Time Periods

                                                        Number of obs = 38,042
Outcome model  : regression adjustment
Treatment model: none
------------------------------------------------------------------------------
             | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
g2           |
       t_1_2 |   .0161844   .0065602     2.47   0.014     .0033267    .0290421
       t_1_3 |    .079245    .022349     3.55   0.000     .0354418    .1230483
       t_1_4 |   .0790281   .0241041     3.28   0.001     .0317849    .1262712
       t_1_5 |   .0617656   .0208679     2.96   0.003     .0208653    .1026658
-------------+----------------------------------------------------------------
g3           |
       t_1_2 |   .0210849    .004212     5.01   0.000     .0128295    .0293402
       t_2_3 |    .040359   .0082862     4.87   0.000     .0241184    .0565996
       t_2_4 |   .0162996   .0067423     2.42   0.016      .003085    .0295142
       t_2_5 |   .0303758   .0085279     3.56   0.000     .0136614    .0470902
-------------+----------------------------------------------------------------
g4           |
       t_1_2 |  -.0439547   .0005086   -86.42   0.000    -.0449515   -.0429578
       t_2_3 |   .0564394   .0009929    56.85   0.000     .0544934    .0583853
       t_3_4 |  -.0692106   .0005165  -134.00   0.000    -.0702229   -.0681983
       t_3_5 |  -.0485215   .0006746   -71.92   0.000    -.0498437   -.0471992
-------------+----------------------------------------------------------------
g5           |
       t_1_2 |   -.014802   .0034881    -4.24   0.000    -.0216386   -.0079655
       t_2_3 |  -.0067809   .0046161    -1.47   0.142    -.0158283    .0022665
       t_3_4 |    .000345   .0074088     0.05   0.963     -.014176     .014866
       t_4_5 |   .0039694   .0074588     0.53   0.595    -.0106496    .0185884
------------------------------------------------------------------------------
Control: Never Treated

See Callaway and Sant'Anna (2021) for details

I now try to replicate "g4 t_2_3" using a standard DID regression.

I import a different dataset extracted from the one used with the csdid command. This new dataset only has observations in periods 2 and 3 for those municipalities treated in period 4 for the first time.

I first delete unbalanced observations, as the csdid command would do:

Code:

. egen var1 = count(vote_share), by(comune)

. keep if var1==2
(614 observations deleted)

Then I create a dummy for each of the two periods:

Code:

. tab period, gen(dummyP)

     period |      Freq.     Percent        Cum.
------------+-----------------------------------
          2 |      7,462       50.00       50.00
          3 |      7,462       50.00      100.00
------------+-----------------------------------
      Total |     14,924      100.00

And then run the DID regression:

Code:

. reg vote_share ever_treated##dummyP2

      Source |       SS           df       MS      Number of obs   =    14,924
-------------+----------------------------------   F(3, 14920)     =   1727.22
       Model |  40.7331805         3  13.5777268   Prob > F        =    0.0000
    Residual |  117.286738    14,920  .007861041   R-squared       =    0.2578
-------------+----------------------------------   Adj R-squared   =    0.2576
       Total |  158.019919    14,923  .010589018   Root MSE        =    .08866

--------------------------------------------------------------------------------------
          vote_share | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
---------------------+----------------------------------------------------------------
      1.ever_treated |    .039173   .0886685     0.44   0.659    -.1346281    .2129741
           1.dummyP2 |   .1044656   .0014516    71.96   0.000     .1016202    .1073109
                     |
ever_treated#dummyP2 |
                1 1  |   .0564394   .1253961     0.45   0.653    -.1893525    .3022312
                     |
               _cons |   .1794675   .0010265   174.84   0.000     .1774555    .1814795
--------------------------------------------------------------------------------------

The resulting coefficient is the same: .0564394
But the standard errors are different: .0009929 from csdid, .1253961 from the DID regression.

This always happens. The csdid SEs are always different, either smaller (as in this case) or bigger.

For example, for g2 t_1_4, csdid gives a bigger SE: .0241041
The DID regression's SE is .0212055, which is smaller.

Code:

. reg vote_share ever_treated##dummyP2

      Source |       SS           df       MS      Number of obs   =    14,642
-------------+----------------------------------   F(3, 14638)     =   1249.40
       Model |  22.6698886         3  7.55662952   Prob > F        =    0.0000
    Residual |  88.5335856    14,638  .006048202   R-squared       =    0.2039
-------------+----------------------------------   Adj R-squared   =    0.2037
       Total |  111.203474    14,641  .007595347   Root MSE        =    .07777

--------------------------------------------------------------------------------------
          vote_share | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
---------------------+----------------------------------------------------------------
      1.ever_treated |   .0280523   .0149946     1.87   0.061    -.0013389    .0574435
           1.dummyP2 |  -.0784136   .0012878   -60.89   0.000    -.0809378   -.0758893
                     |
ever_treated#dummyP2 |
                1 1  |   .0790281   .0212055     3.73   0.000     .0374626    .1205935
                     |
               _cons |   .2259089   .0009106   248.09   0.000      .224124    .2276938
--------------------------------------------------------------------------------------

Why does this happen? Thank you.

Tags: csdid

FernandoRios

Join Date: Apr 2014

Posts: 2480
#2

04 Nov 2023, 16:49

Two reasons:
1. csdid clusters standard errors at the individual level when using panel data.
2. It does not incorporate any degrees-of-freedom adjustment.
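
A minimal sketch of the first point, run on simulated data rather than the dataset from the question. The variable names (id, post, u, y) and all parameter values are hypothetical and chosen only for illustration. It compares the default OLS standard errors with standard errors clustered on the panel unit. Because regress still applies a finite-sample correction to clustered standard errors, the clustered result should come close to what csdid reports but need not match it exactly.

```stata
* Simulated two-period panel (hypothetical names and values, illustration only)
clear
set seed 12345
set obs 500
generate id = _n
generate ever_treated = id <= 50      // first 50 units are treated
generate u = rnormal()                // persistent unit-level component
expand 2                              // two observations (periods) per unit
bysort id: generate post = _n == 2    // second observation is the post period
generate y = u + 0.1*post + 0.05*ever_treated*post + 0.1*rnormal()

* Default OLS standard errors: treat the two rows of each unit as independent
regress y ever_treated##post

* Same coefficients, standard errors clustered on the panel unit
regress y ever_treated##post, vce(cluster id)

* Analogous call on the data from the question, assuming idcomune has been
* created in the two-period extract with encode:
* regress vote_share ever_treated##dummyP2, vce(cluster idcomune)
```

The point estimates are identical across the two regress calls; only the standard errors change. In this simulation the persistent unit component u dominates the noise, so the clustered standard error on the interaction should be noticeably smaller than the default one.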