I am trying to understand how -xthdidregress ra- works in Stata/SE 18. I can reproduce its estimates "by hand" when there are no covariates, but once I add a covariate my by-hand estimates no longer match the command's. Why is this?
Reproducible example below using a panel of three states from 2001 to 2005. State 1's treatment begins in 2003. The estimates match without covariates but do not match when I include the -jobs- covariate. (Note: I use the notation provided on p. 17 of the -xthdidregress- help file.)
Setup
Code:
clear all
input state year gdp post2003 treatmentGroup treated jobs
1 2001 100 0 1 0 329
1 2002 115 0 1 0 203
1 2003 95 1 1 1 215
1 2004 87 1 1 1 151
1 2005 73 1 1 1 120
2 2001 113 0 0 0 415
2 2002 117 0 0 0 417
2 2003 121 1 0 0 425
2 2004 125 1 0 0 429
2 2005 129 1 0 0 437
3 2001 47 0 0 0 143
3 2002 53 0 0 0 142
3 2003 59 1 0 0 149
3 2004 62 1 0 0 152
3 2005 66 1 0 0 155
end

*Set panel
xtset state year, yearly
WITHOUT covariates
Code:
. ///*** Stata Command ***///
. xthdidregress ra (gdp) (treated), group(state)
note: variable _did_cohort, containing cohort indicators formed by treatment variable treated and group variable state, was added to the dataset.
Computing ATET for each cohort and time:
Cohort 2003 (4): .... done
Treatment and time information
Time variable: year
Time interval: 2001 to 2005
Control: _did_cohort = 0
Treatment: _did_cohort > 0
-------------------------------
                  | _did_cohort
------------------+------------
Number of cohorts |           2
------------------+------------
Number of obs     |
    Never treated |          10
             2003 |           5
-------------------------------
Heterogeneous-treatment-effects regression               Number of obs    = 15
                                                         Number of panels =  3
Estimator:       Regression adjustment
Panel variable:  state
Treatment level: state
Control group:   Never treated
                                 (Std. err. adjusted for 3 clusters in state)
------------------------------------------------------------------------------
             |               Robust
      Cohort |       ATET   std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
year         |
        2002 |         10   .7071068    14.14   0.000     8.614096     11.3859
        2003 |        -25   .7071068   -35.36   0.000     -26.3859    -23.6141
        2004 |      -36.5   .3535534  -103.24   0.000    -37.19295   -35.80705
        2005 |      -54.5   .3535534  -154.15   0.000    -55.19295   -53.80705
------------------------------------------------------------------------------
.
. ///*** Attempt to Reproduce ***///
. *y_t
. gen y_t = gdp
.
. *y_g-1
. gen y_2002 = gdp if year==2002
(12 missing values generated)
. bysort state (y_2002): replace y_2002 = y_2002[1]
(12 real changes made)
.
. *y_t - y_g-1
. gen dy = y_t - y_2002
.
.
. *m_g,t(x)
. reg dy ib2001.year if treatmentGroup==0
      Source |       SS           df       MS      Number of obs   =        10
-------------+----------------------------------   F(4, 5)         =     95.15
       Model |       380.6         4       95.15   Prob > F        =    0.0001
    Residual |           5         5           1   R-squared       =    0.9870
-------------+----------------------------------   Adj R-squared   =    0.9767
       Total |       385.6         9  42.8444444   Root MSE        =         1
------------------------------------------------------------------------------
          dy | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
        year |
        2002 |          5          1     5.00   0.004     2.429418    7.570582
        2003 |         10          1    10.00   0.000     7.429418    12.57058
        2004 |       13.5          1    13.50   0.000     10.92942    16.07058
        2005 |       17.5          1    17.50   0.000     14.92942    20.07058
             |
       _cons |         -5   .7071068    -7.07   0.001    -6.817676   -3.182324
------------------------------------------------------------------------------
. predict mhat, xb
.
. *Predicted ATET
. egen atet = mean(y_t - y_2002 - mhat) if treated==1, by(year)
(12 missing values generated)
. tab atet year
           |               year
      atet |      2003       2004       2005 |     Total
-----------+---------------------------------+----------
     -54.5 |         0          0          1 |         1
     -36.5 |         0          1          0 |         1
       -25 |         1          0          0 |         1
-----------+---------------------------------+----------
     Total |         1          1          1 |         3
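As a cross-check on the no-covariate case: the regression of dy on year dummies is saturated in year, so mhat in each year is simply the mean change among the two never-treated states, and the by-hand ATET reduces to a plain difference in differences. A worked version using the Setup data follows; the state subscript i is shorthand introduced here and is not the help file's notation.

```latex
\begin{aligned}
\widehat{\mathrm{ATET}}(2003,t)
  &= \bigl(y_{1,t}-y_{1,2002}\bigr)
     - \tfrac{1}{2}\sum_{i\in\{2,3\}}\bigl(y_{i,t}-y_{i,2002}\bigr) \\[4pt]
t=2003:\quad & (95-115) - \tfrac{1}{2}\bigl[(121-117)+(59-53)\bigr] = -20 - 5 = -25 \\
t=2004:\quad & (87-115) - \tfrac{1}{2}\bigl[(125-117)+(62-53)\bigr] = -28 - 8.5 = -36.5 \\
t=2005:\quad & (73-115) - \tfrac{1}{2}\bigl[(129-117)+(66-53)\bigr] = -42 - 12.5 = -54.5
\end{aligned}
```

The pre-treatment estimate of 10 for 2002 is consistent with the same formula using 2001 as the base period, (115-100) - (1/2)[(117-113)+(53-47)] = 15 - 5 = 10. That base-period choice is inferred from the numbers and is not stated in the output.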
WITH -jobs- covariate
Below uses the same code, except that I include -jobs- both in -xthdidregress ra- and in the regression for -m_g,t(x)-.
Code:
. ///*** Stata Command ***///
. xthdidregress ra (gdp jobs) (treated), group(state)
note: variable _did_cohort, containing cohort indicators formed by treatment variable treated and group variable state, was added to the dataset.
Computing ATET for each cohort and time:
Cohort 2003 (4): .... done
Treatment and time information
Time variable: year
Time interval: 2001 to 2005
Control: _did_cohort = 0
Treatment: _did_cohort > 0
-------------------------------
                  | _did_cohort
------------------+------------
Number of cohorts |           2
------------------+------------
Number of obs     |
    Never treated |          10
             2003 |           5
-------------------------------
Heterogeneous-treatment-effects regression               Number of obs    = 15
                                                         Number of panels =  3
Estimator:       Regression adjustment
Panel variable:  state
Treatment level: state
Control group:   Never treated
                                 (Std. err. adjusted for 3 clusters in state)
------------------------------------------------------------------------------
             |               Robust
      Cohort |       ATET   std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
year         |
        2002 |   10.36765   1.78e-15  5.8e+15   0.000     10.36765    10.36765
        2003 |  -25.55636   3.55e-15 -7.2e+15   0.000    -25.55636   -25.55636
        2004 |  -36.77818          .        .       .            .           .
        2005 |  -54.77818          .        .       .            .           .
------------------------------------------------------------------------------
Note: ATET computed using covariates.
.
. ///*** Attempt to Reproduce ***///
. *y_t
. gen y_t = gdp
.
. *y_g-1
. gen y_2002 = gdp if year==2002
(12 missing values generated)
. bysort state (y_2002): replace y_2002 = y_2002[1]
(12 real changes made)
.
. *y_t - y_g-1
. gen dy = y_t - y_2002
.
.
. *m_g,t(x)
. reg dy ib2001.year jobs if treatmentGroup==0
      Source |       SS           df       MS      Number of obs   =        10
-------------+----------------------------------   F(5, 4)         =     66.56
       Model |  381.020755         5  76.2041511   Prob > F        =    0.0006
    Residual |  4.57924473         4  1.14481118   R-squared       =    0.9881
-------------+----------------------------------   Adj R-squared   =    0.9733
       Total |       385.6         9  42.8444444   Root MSE        =      1.07
------------------------------------------------------------------------------
          dy | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
        year |
        2002 |   5.000742   1.069959     4.67   0.009     2.030059    7.971425
        2003 |   10.01187   1.070138     9.36   0.001     7.040695    12.98305
        2004 |   13.51707   1.070329    12.63   0.000     10.54536    16.48878
        2005 |   17.52523   1.070768    16.37   0.000      14.5523    20.49816
             |
        jobs |  -.0014841   .0024481    -0.61   0.577    -.0082812    .0053129
       _cons |  -4.585923   1.019275    -4.50   0.011    -7.415883   -1.755963
------------------------------------------------------------------------------
. predict mhat, xb
.
. *Predicted ATET
. egen atet = mean(y_t - y_2002 - mhat) if treated==1, by(year)
(12 missing values generated)
. tab atet year
           |               year
      atet |      2003       2004       2005 |     Total
-----------+---------------------------------+----------
 -54.76121 |         0          0          1 |         1
 -36.70704 |         0          1          0 |         1
 -25.10686 |         1          0          0 |         1
-----------+---------------------------------+----------
     Total |         1          1          1 |         3
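One variation that may be worth checking, offered as a sketch and not as a confirmed description of what -xthdidregress ra- does internally: fit the outcome regression separately for each year, using only the never-treated states, with -jobs- measured in the base period (2002) instead of the current year. The pooled regression above differs on both counts, since it imposes a single -jobs- slope across all years and uses each year's own -jobs- value. The names gdp_base, jobs_base, dy_base, m_t and atet_alt are illustrative, and the code assumes the Setup data are in memory.

```stata
* Sketch only: assumes the Setup data are loaded and that the base period is 2002.
* All new variable names are illustrative.

* Base-period (2002) outcome and covariate, copied to every year within state
egen gdp_base  = max(cond(year == 2002, gdp,  .)), by(state)
egen jobs_base = max(cond(year == 2002, jobs, .)), by(state)
gen dy_base = gdp - gdp_base

gen atet_alt = .
forvalues t = 2003/2005 {
    * Separate outcome regression for each year, never-treated states only
    quietly regress dy_base jobs_base if treatmentGroup == 0 & year == `t'
    quietly predict m_t if year == `t', xb
    * Treated state's change minus its predicted untreated change
    * (with several treated units this would be averaged over them)
    quietly replace atet_alt = dy_base - m_t if treatmentGroup == 1 & year == `t'
    drop m_t
}
list year atet_alt if treatmentGroup == 1 & year >= 2003, noobs
```

With the data in this post each yearly regression has two observations and two parameters, so it fits exactly. Working the 2003 case by hand gives a predicted untreated change of about 5.556 for state 1, hence -20 - 5.556 = -25.556, which is close to the -25.55636 reported by the command. The same arithmetic gives about -36.778 and -54.778 for 2004 and 2005. The exact fit may also be why the reported standard errors are essentially zero or missing.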
