I am trying to understand how -xthdidregress ra- works in Stata/SE 18. I can reproduce its estimates "by hand" when there are no covariates, but once I add one or more covariates my estimates no longer match. Why is this?
A reproducible example follows, using a panel of three states from 2001 to 2005. State 1's treatment begins in 2003. The estimates match without covariates but do not match once I include the -jobs- covariate. (Note: I use the notation provided on p. 17 of the -xthdidregress- help file.)
Setup
Code:
clear all
input state year gdp post2003 treatmentGroup treated jobs
1 2001 100 0 1 0 329
1 2002 115 0 1 0 203
1 2003  95 1 1 1 215
1 2004  87 1 1 1 151
1 2005  73 1 1 1 120
2 2001 113 0 0 0 415
2 2002 117 0 0 0 417
2 2003 121 1 0 0 425
2 2004 125 1 0 0 429
2 2005 129 1 0 0 437
3 2001  47 0 0 0 143
3 2002  53 0 0 0 142
3 2003  59 1 0 0 149
3 2004  62 1 0 0 152
3 2005  66 1 0 0 155
end

*Set panel
xtset state year, yearly
WITHOUT covariates
Code:
. ///*** Stata Command ***///
. xthdidregress ra (gdp) (treated), group(state)
note: variable _did_cohort, containing cohort indicators formed by treatment
      variable treated and group variable state, was added to the dataset.

Computing ATET for each cohort and time:
Cohort 2003 (4): .... done

Treatment and time information

Time variable: year
Time interval: 2001 to 2005
Control:       _did_cohort = 0
Treatment:     _did_cohort > 0
-------------------------------
                  | _did_cohort
------------------+------------
Number of cohorts |           2
------------------+------------
Number of obs     |
    Never treated |          10
             2003 |           5
-------------------------------

Heterogeneous-treatment-effects regression               Number of obs    = 15
                                                         Number of panels =  3
Estimator:       Regression adjustment
Panel variable:  state
Treatment level: state
Control group:   Never treated

                                  (Std. err. adjusted for 3 clusters in state)
------------------------------------------------------------------------------
             |               Robust
      Cohort |       ATET   std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
        year |
       2002  |         10   .7071068    14.14   0.000     8.614096     11.3859
       2003  |        -25   .7071068   -35.36   0.000     -26.3859    -23.6141
       2004  |      -36.5   .3535534  -103.24   0.000    -37.19295   -35.80705
       2005  |      -54.5   .3535534  -154.15   0.000    -55.19295   -53.80705
------------------------------------------------------------------------------

.
. ///*** Attempt to Reproduce ***///
. *y_t
. gen y_t = gdp
.
. *y_g-1
. gen y_2002 = gdp if year==2002
(12 missing values generated)
. bysort state (y_2002): replace y_2002 = y_2002[1]
(12 real changes made)
.
. *y_t - y_g-1
. gen dy = y_t - y_2002
.
.
. *m_g,t(x)
. reg dy ib2001.year if treatmentGroup==0

      Source |       SS           df       MS      Number of obs   =        10
-------------+----------------------------------   F(4, 5)         =     95.15
       Model |       380.6         4       95.15   Prob > F        =    0.0001
    Residual |           5         5           1   R-squared       =    0.9870
-------------+----------------------------------   Adj R-squared   =    0.9767
       Total |       385.6         9  42.8444444   Root MSE        =         1

------------------------------------------------------------------------------
          dy | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
        year |
       2002  |          5          1     5.00   0.004     2.429418    7.570582
       2003  |         10          1    10.00   0.000     7.429418    12.57058
       2004  |       13.5          1    13.50   0.000     10.92942    16.07058
       2005  |       17.5          1    17.50   0.000     14.92942    20.07058
             |
       _cons |         -5   .7071068    -7.07   0.001    -6.817676   -3.182324
------------------------------------------------------------------------------

. predict mhat, xb
.
. *Predicted ATET
. egen atet = mean(y_t - y_2002 - m) if treated==1, by(year)
(12 missing values generated)
. tab atet year

           |               year
      atet |      2003       2004       2005 |     Total
-----------+---------------------------------+----------
     -54.5 |         0          0          1 |         1
     -36.5 |         0          1          0 |         1
       -25 |         1          0          0 |         1
-----------+---------------------------------+----------
     Total |         1          1          1 |         3
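The match in the no-covariate case can also be checked by plain arithmetic from the setup data. With only year indicators on the right-hand side, the fitted -m_g,t- is just the control-group mean of y_t - y_2002 in year t (for example, -5 + 10 = 5 in 2003). The notation below is my own shorthand, not the help file's:

```latex
\begin{aligned}
\widehat{ATET}(2003, 2003) &= (95 - 115) - \tfrac{1}{2}\big[(121 - 117) + (59 - 53)\big] = -20 - 5 = -25 \\
\widehat{ATET}(2003, 2004) &= (87 - 115) - \tfrac{1}{2}\big[(125 - 117) + (62 - 53)\big] = -28 - 8.5 = -36.5 \\
\widehat{ATET}(2003, 2005) &= (73 - 115) - \tfrac{1}{2}\big[(129 - 117) + (66 - 53)\big] = -42 - 12.5 = -54.5
\end{aligned}
```

These agree with both the -xthdidregress ra- table and the -tab atet year- output.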
WITH -jobs- covariate
The code below is the same as above, except that I include -jobs- both in -xthdidregress ra- and in the regression for -m_g,t(x)-.
Code:
. ///*** Stata Command ***///
. xthdidregress ra (gdp jobs) (treated), group(state)
note: variable _did_cohort, containing cohort indicators formed by treatment
      variable treated and group variable state, was added to the dataset.

Computing ATET for each cohort and time:
Cohort 2003 (4): .... done

Treatment and time information

Time variable: year
Time interval: 2001 to 2005
Control:       _did_cohort = 0
Treatment:     _did_cohort > 0
-------------------------------
                  | _did_cohort
------------------+------------
Number of cohorts |           2
------------------+------------
Number of obs     |
    Never treated |          10
             2003 |           5
-------------------------------

Heterogeneous-treatment-effects regression               Number of obs    = 15
                                                         Number of panels =  3
Estimator:       Regression adjustment
Panel variable:  state
Treatment level: state
Control group:   Never treated

                                  (Std. err. adjusted for 3 clusters in state)
------------------------------------------------------------------------------
             |               Robust
      Cohort |       ATET   std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
        year |
       2002  |   10.36765   1.78e-15  5.8e+15   0.000     10.36765    10.36765
       2003  |  -25.55636   3.55e-15 -7.2e+15   0.000    -25.55636   -25.55636
       2004  |  -36.77818          .        .       .            .           .
       2005  |  -54.77818          .        .       .            .           .
------------------------------------------------------------------------------
Note: ATET computed using covariates.

.
. ///*** Attempt to Reproduce ***///
. *y_t
. gen y_t = gdp
.
. *y_g-1
. gen y_2002 = gdp if year==2002
(12 missing values generated)
. bysort state (y_2002): replace y_2002 = y_2002[1]
(12 real changes made)
.
. *y_t - y_g-1
. gen dy = y_t - y_2002
.
.
. *m_g,t(x)
. reg dy ib2001.year jobs if treatmentGroup==0

      Source |       SS           df       MS      Number of obs   =        10
-------------+----------------------------------   F(5, 4)         =     66.56
       Model |  381.020755         5  76.2041511   Prob > F        =    0.0006
    Residual |  4.57924473         4  1.14481118   R-squared       =    0.9881
-------------+----------------------------------   Adj R-squared   =    0.9733
       Total |       385.6         9  42.8444444   Root MSE        =      1.07

------------------------------------------------------------------------------
          dy | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
        year |
       2002  |   5.000742   1.069959     4.67   0.009     2.030059    7.971425
       2003  |   10.01187   1.070138     9.36   0.001     7.040695    12.98305
       2004  |   13.51707   1.070329    12.63   0.000     10.54536    16.48878
       2005  |   17.52523   1.070768    16.37   0.000      14.5523    20.49816
             |
        jobs |  -.0014841   .0024481    -0.61   0.577    -.0082812    .0053129
       _cons |  -4.585923   1.019275    -4.50   0.011    -7.415883   -1.755963
------------------------------------------------------------------------------

. predict mhat, xb
.
. *Predicted ATET
. egen atet = mean(y_t - y_2002 - m) if treated==1, by(year)
(12 missing values generated)
. tab atet year

           |               year
      atet |      2003       2004       2005 |     Total
-----------+---------------------------------+----------
 -54.76121 |         0          0          1 |         1
 -36.70704 |         0          1          0 |         1
 -25.10686 |         1          0          0 |         1
-----------+---------------------------------+----------
     Total |         1          1          1 |         3
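One variant that may be worth comparing against the pooled regression above is sketched here. It rests on two assumptions of mine, neither quoted from the help file: (1) the outcome regression is fit separately for each year on the never-treated states, and (2) -jobs- enters at its base-period (2002) value rather than its contemporaneous value. The names y_base, jobs_base, dy_lr, mhat_t, and atet_t are hypothetical.

```stata
* Sketch only: year-by-year outcome regression with the covariate held
* at its base-period value (both are assumptions, see the text above).
clear all
input state year gdp treatmentGroup treated jobs
1 2001 100 1 0 329
1 2002 115 1 0 203
1 2003  95 1 1 215
1 2004  87 1 1 151
1 2005  73 1 1 120
2 2001 113 0 0 415
2 2002 117 0 0 417
2 2003 121 0 0 425
2 2004 125 0 0 429
2 2005 129 0 0 437
3 2001  47 0 0 143
3 2002  53 0 0 142
3 2003  59 0 0 149
3 2004  62 0 0 152
3 2005  66 0 0 155
end
xtset state year, yearly

* Base period for the 2003 cohort is g-1 = 2002: spread gdp and jobs
* from 2002 to every row of the same state (missing values sort last)
gen double y_base = gdp if year == 2002
bysort state (y_base): replace y_base = y_base[1]
gen double jobs_base = jobs if year == 2002
bysort state (jobs_base): replace jobs_base = jobs_base[1]

* Long difference y_t - y_g-1
gen double dy_lr = gdp - y_base
gen double mhat_t = .

forvalues t = 2003/2005 {
    * Fit m_g,t(x) on never-treated states only, one year at a time
    quietly regress dy_lr jobs_base if treatmentGroup == 0 & year == `t'
    quietly predict double tmp, xb
    quietly replace mhat_t = tmp if year == `t'
    drop tmp
}

* ATET by year: treated long difference minus predicted control change
egen double atet_t = mean(dy_lr - mhat_t) if treated == 1, by(year)
sort state year
tabulate atet_t year
```

Working this through by hand on the example data gives -25.55636, -36.77818, and -54.77818 for 2003 to 2005, which are the values -xthdidregress ra- reports. If that holds up, the gap in my attempt would come from the pooled regression, which forces a single -jobs- slope across all years and uses contemporaneous -jobs-. The same logic with 2001 as the base period (y_2002 - y_2001, with -jobs- from 2001) gives 10.36765 for the 2002 pre-treatment row.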