Hello. I am estimating a staggered difference-in-differences model (Callaway and Sant'Anna 2021) with the csdid package, and I am comparing its results to several standard DID regressions.
I don't understand why DID regressions and the csdid command produce different standard errors.
Here are the results of the csdid command:
Code:
. encode comune, gen(idcomune)

. csdid vote_share, ivar(idcomune) time(period) gvar(first_treated) reg
Units always treated found. These will be ignored
Panel is not balanced
Will use observations with Pair balanced (observed at t0 and t1)
................
Difference-in-difference with Multiple Time Periods

Number of obs = 38,042
Outcome model  : regression adjustment
Treatment model: none
------------------------------------------------------------------------------
             | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
g2           |
       t_1_2 |   .0161844   .0065602     2.47   0.014     .0033267    .0290421
       t_1_3 |    .079245    .022349     3.55   0.000     .0354418    .1230483
       t_1_4 |   .0790281   .0241041     3.28   0.001     .0317849    .1262712
       t_1_5 |   .0617656   .0208679     2.96   0.003     .0208653    .1026658
-------------+----------------------------------------------------------------
g3           |
       t_1_2 |   .0210849    .004212     5.01   0.000     .0128295    .0293402
       t_2_3 |    .040359   .0082862     4.87   0.000     .0241184    .0565996
       t_2_4 |   .0162996   .0067423     2.42   0.016      .003085    .0295142
       t_2_5 |   .0303758   .0085279     3.56   0.000     .0136614    .0470902
-------------+----------------------------------------------------------------
g4           |
       t_1_2 |  -.0439547   .0005086   -86.42   0.000    -.0449515   -.0429578
       t_2_3 |   .0564394   .0009929    56.85   0.000     .0544934    .0583853
       t_3_4 |  -.0692106   .0005165  -134.00   0.000    -.0702229   -.0681983
       t_3_5 |  -.0485215   .0006746   -71.92   0.000    -.0498437   -.0471992
-------------+----------------------------------------------------------------
g5           |
       t_1_2 |   -.014802   .0034881    -4.24   0.000    -.0216386   -.0079655
       t_2_3 |  -.0067809   .0046161    -1.47   0.142    -.0158283    .0022665
       t_3_4 |    .000345   .0074088     0.05   0.963     -.014176     .014866
       t_4_5 |   .0039694   .0074588     0.53   0.595    -.0106496    .0185884
------------------------------------------------------------------------------
Control: Never Treated
See Callaway and Sant'Anna (2021) for details
I now try to imitate the g4 t_2_3 estimate with a standard DID regression. I use a subset extracted from the dataset used with the csdid command: it contains only the observations in periods 2 and 3 for the municipalities first treated in period 4.
I first delete unbalanced observations as the csdid command would do:
Code:
. egen var1 = count(vote_share), by(comune)

. keep if var1==2
(614 observations deleted)
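For anyone following along in another language, the same pair-balancing step (keeping only municipalities observed in both periods) can be sketched with pandas; the toy data and column names below are made up, mirroring my Stata variables:

```python
import pandas as pd

# Toy panel: municipality C is observed in only one period (unbalanced).
df = pd.DataFrame({
    "comune":     ["A", "A", "B", "B", "C"],
    "period":     [2, 3, 2, 3, 2],
    "vote_share": [0.21, 0.25, 0.18, 0.20, 0.30],
})

# Equivalent of: egen var1 = count(vote_share), by(comune); keep if var1==2
df["var1"] = df.groupby("comune")["vote_share"].transform("count")
balanced = df[df["var1"] == 2].drop(columns="var1")

print(balanced["comune"].unique())
```

Municipality C appears in only one period, so it is dropped, analogous to the 614 observations deleted by `keep if var1==2`.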
Then I create a dummy for each of the two periods:

Code:
. tab period, gen(dummyP)

     period |      Freq.     Percent        Cum.
------------+-----------------------------------
          2 |      7,462       50.00       50.00
          3 |      7,462       50.00      100.00
------------+-----------------------------------
      Total |     14,924      100.00
And then I run the DID regression:

Code:
. reg vote_share ever_treated##dummyP2

      Source |       SS           df       MS      Number of obs   =    14,924
-------------+----------------------------------   F(3, 14920)     =   1727.22
       Model |  40.7331805         3  13.5777268   Prob > F        =    0.0000
    Residual |  117.286738    14,920  .007861041   R-squared       =    0.2578
-------------+----------------------------------   Adj R-squared   =    0.2576
       Total |  158.019919    14,923  .010589018   Root MSE        =    .08866

--------------------------------------------------------------------------------------
          vote_share | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
---------------------+----------------------------------------------------------------
      1.ever_treated |    .039173   .0886685     0.44   0.659    -.1346281    .2129741
           1.dummyP2 |   .1044656   .0014516    71.96   0.000     .1016202    .1073109
                     |
ever_treated#dummyP2 |
                1 1  |   .0564394   .1253961     0.45   0.653    -.1893525    .3022312
                     |
               _cons |   .1794675   .0010265   174.84   0.000     .1774555    .1814795
--------------------------------------------------------------------------------------
The resulting interaction coefficient is the same: .0564394. But the standard errors are different: .0009929 in csdid, .1253961 in the DID regression.
This always happens: csdid's standard errors always differ from the regression's, sometimes smaller (as in this case) and sometimes bigger. For example, for g2 t_1_4 csdid gives the bigger SE, .0241041, while the DID regression's SE is smaller, .0212055:
Code:
. reg vote_share ever_treated##dummyP2

      Source |       SS           df       MS      Number of obs   =    14,642
-------------+----------------------------------   F(3, 14638)     =   1249.40
       Model |  22.6698886         3  7.55662952   Prob > F        =    0.0000
    Residual |  88.5335856    14,638  .006048202   R-squared       =    0.2039
-------------+----------------------------------   Adj R-squared   =    0.2037
       Total |  111.203474    14,641  .007595347   Root MSE        =    .07777

--------------------------------------------------------------------------------------
          vote_share | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
---------------------+----------------------------------------------------------------
      1.ever_treated |   .0280523   .0149946     1.87   0.061    -.0013389    .0574435
           1.dummyP2 |  -.0784136   .0012878   -60.89   0.000    -.0809378   -.0758893
                     |
ever_treated#dummyP2 |
                1 1  |   .0790281   .0212055     3.73   0.000     .0374626    .1205935
                     |
               _cons |   .2259089   .0009106   248.09   0.000      .224124    .2276938
--------------------------------------------------------------------------------------
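To convince myself that identical coefficients with very different standard errors are at least mechanically possible, I put together a minimal synthetic two-period example (Python, purely made-up data, not my dataset). In a balanced two-period panel, the interaction coefficient from the levels regression equals the difference in mean first differences across groups, but the conventional OLS standard error treats the two observations per unit as independent, while a standard error computed from the cross-section of first differences nets out the unit effects:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500                                   # units per group
alpha = rng.normal(0, 1, 2 * n)           # unit fixed effects
treat = np.repeat([0, 1], n)              # group indicator
eps = rng.normal(0, 0.1, (2 * n, 2))      # idiosyncratic noise, 2 periods
att = 0.05                                # true treatment effect

y_pre  = alpha + eps[:, 0]
y_post = alpha + 0.1 + att * treat + eps[:, 1]

# (1) Pooled OLS in levels: y ~ treat + post + treat:post (homoskedastic SE)
y = np.concatenate([y_pre, y_post])
post = np.repeat([0, 1], 2 * n)
tr = np.concatenate([treat, treat])
X = np.column_stack([np.ones(4 * n), tr, post, tr * post])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta
sigma2 = resid @ resid / (4 * n - 4)
cov = sigma2 * np.linalg.inv(X.T @ X)
b_ols, se_ols = beta[3], np.sqrt(cov[3, 3])

# (2) Same estimand from first differences: mean(dY | treated) - mean(dY | control)
d = y_post - y_pre
dt, dc = d[treat == 1], d[treat == 0]
b_fd = dt.mean() - dc.mean()
se_fd = np.sqrt(dt.var(ddof=1) / n + dc.var(ddof=1) / n)

print(f"coef levels = {b_ols:.6f}, coef FD = {b_fd:.6f}")
print(f"SE   levels = {se_ols:.6f}, SE   FD = {se_fd:.6f}")
```

In this toy example the two coefficients agree to machine precision while the levels SE is roughly ten times the first-difference SE, which looks qualitatively like my g4 t_2_3 comparison. I suspect clustering the levels regression by municipality would close much of the gap, but I have not verified how csdid computes its variance internally.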
Why does this happen? Thank you.