Hello Stata community,
I am running into a perfect collinearity problem in a triple difference regression. Specifically, Stata omits both the triple difference term and one of the two-way interaction terms, and I'm not sure why.
I am analyzing panel data from 2001-2014, where the unit of analysis is the university-year (n=31,182). I am interacting the following three variables: type (e.g. public vs. private university), treatment (assigned at the state level), and post-period (which begins in a different year for each treated state). I am using Stata 14.1 on Windows 10.
My code and output are as follows:
*Generate three individual dummy variables*
gen Type = 0
replace Type = 1 if campus_type=="Public"
gen Treat = 0
replace Treat = 1 if state=="CO"
replace Treat = 1 if state=="ID"
replace Treat = 1 if state=="KS"
replace Treat = 1 if state=="MS"
replace Treat = 1 if state=="OR"
replace Treat = 1 if state=="UT"
replace Treat = 1 if state=="WI"
gen Post = 0
replace Post = 1 if state=="CO" & year>=2011
replace Post = 1 if state=="MS" & year>=2012
replace Post = 1 if state=="KS" & year>=2014
replace Post = 1 if state=="OR" & year>=2012
replace Post = 1 if state=="UT" & year>=2007
replace Post = 1 if state=="WI" & year>=2012
*Generate interaction variables*
gen TypeXTreat = (Type*Treat)
gen TypeXPost = (Type*Post)
gen TreatxPost = (Treat*Post)
gen Triplediff = (Type*Treat*Post)
*Conduct regression*
areg rate_y Treat Post Type TypeXTreat TypeXPost TreatxPost Triplediff, absorb(year) r
Linear regression, absorbing indicators         Number of obs     =     31,183
                                                F( 5, 31164)      =       8.46
                                                Prob > F          =     0.0000
                                                R-squared         =     0.0005
                                                Adj R-squared     =    -0.0001
                                                Root MSE          =    27.8297
------------------------------------------------------------------------------
             |               Robust
      rate_y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       Treat |   -.781114   .3308301    -2.36   0.018    -1.429554   -.1326737
        Post |  -.1396623   .3242237    -0.43   0.667    -.7751538    .4958292
        Type |   -.991002   .2478075    -4.00   0.000    -1.476715   -.5052894
  TypeXTreat |   .6589614   .3017063     2.18   0.029      .067605    1.250318
   TypeXPost |   .2637135   .1993894     1.32   0.186    -.1270977    .6545247
  TreatxPost |          0  (omitted)
  Triplediff |          0  (omitted)
       _cons |   1.260263   .2462696     5.12   0.000     .7775652    1.742962
-------------+----------------------------------------------------------------
        year |   absorbed                                      (14 categories)
I can make the collinearity go away by commenting out any one of the lines that assign the Treat variable. For example, changing one line in the above code to the following fixes the problem:
*replace Treat = 1 if state=="CO" // (e.g. mute this line of code, no more collinearity)
Maybe I've been staring at the data too long, but I don't understand why there is a collinearity problem before commenting out one such line. Leaving a treated state uncoded means my model no longer reflects which states were actually treated, so I'd like to find a way around the issue if possible. Thanks in advance for any insights here.
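One check that might help narrow this down, as a minimal sketch that assumes only the variables generated above: count the observations where each omitted term differs from a lower-order term it could be duplicating. A count of zero would mean the two columns are identical in the data, which would be enough for Stata to drop one of them.

```stata
* Diagnostic sketch: does an omitted term duplicate a lower-order term?
* A count of 0 would mean TreatxPost and Post are the same column.
count if TreatxPost != Post

* A count of 0 would mean Triplediff and TypeXPost are the same column.
count if Triplediff != TypeXPost

* Show which Treat/Post combinations actually occur in the data.
tabulate Treat Post
```

If the tabulation shows an empty cell (for example, no observations with Treat = 0 and Post = 1), that would point to the way Post is coded rather than to the regression command itself.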
Sincerely,
Jon