D-i-D analysis in panel data

Iván Higuera Mendieta

Join Date: Oct 2014

Posts: 28
#1

24 Oct 2014, 18:06

Hi to all Stata experts,

I am new to Stata and am trying to run a difference-in-differences analysis using electoral panel data covering two years (1997 and 2000), where the treatment takes place in 2000 in a set of municipalities. I used the following code:

Code:

xtreg id year
xtreg yvar treatment year=2000 treatment*year [a set of controls], fe

My idea is to have fixed effects by municipality. However, Stata tells me that my treatment variable is omitted because of collinearity, which is a problem because I need that variable to ensure the validity of the interaction and, ultimately, of my treatment estimate. Is there something I am missing in the code or in my identification strategy? I get the same problem even when I drop the set of controls.

Thanks in advance,

Iván Higuera
Clyde Schechter

Join Date: Apr 2014

Posts: 30100
#2

24 Oct 2014, 18:17

First, you should always show the exact code that you ran and the results you got. I do not believe what you posted is the code you ran. If nothing else, you can't have year=2000 in the xtreg statement, nor is treatment*year valid syntax in that context. Also, xtreg id year makes no sense at all (though it is syntactically valid).

I'm guessing you meant something like this:

Code:

xtset id year
xtreg yvar i.treatment##i.year, fe

and Stata threw out treatment because of collinearity.

As well it should. The treatment variable is constant within id, so it is collinear with the fixed effect and a separate effect cannot be estimated.

But you are not actually interested in the effect of treatment. In a differences in differences analysis, the "treatment effect" is actually estimated by the coefficient of the treatment#year interaction term. And this kind of fixed-effects model is one of the few circumstances where a regression model that includes an interaction but omits one of the main effects is well-specified.
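To make the collinearity argument concrete, here is a minimal algebraic sketch for the two-year case. The notation is my own assumption and not from the original post: $T_i$ is the time-invariant treatment indicator, $P_t$ equals 1 in 2000 and 0 in 1997, and $\alpha_i$ is the municipality fixed effect.

```latex
% Two-period DiD model with unit fixed effects (notation assumed for illustration)
\begin{align}
y_{it} &= \alpha_i + \gamma T_i + \beta P_t + \delta\,(T_i \times P_t) + \varepsilon_{it} \\
% Differencing within each unit removes everything that is constant over time:
y_{i,2000} - y_{i,1997} &= \beta + \delta T_i + (\varepsilon_{i,2000} - \varepsilon_{i,1997})
\end{align}
```

Both $\alpha_i$ and $\gamma T_i$ drop out of the second line, so $\gamma$ cannot be separated from the fixed effect, while $\delta$, the coefficient on the interaction, remains identified. With exactly two periods the fixed-effects estimator is equivalent to this first-difference regression.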
Iván Higuera Mendieta

Join Date: Oct 2014

Posts: 28
#3

24 Oct 2014, 18:40

First, thank you for your prompt response. I am sorry for my informality; I am new to Statalist as well. Second, I know that the interaction term is the relevant estimator here, but if one of the terms of the interaction is lost (due to collinearity) or is statistically non-significant, does that not make the interaction non-significant as well?

Code:

*Defining treatment*
gen treatment=1 if id==50350 | id==50370 | id==50330 | id==50711 | id==18753
replace treatment=0 if treatment==.
*Year indicator*
gen y2000 = (year==2000)
*Interaction*
gen y2000_treatment = y2000 * treatment
*Regression*
xtset id year
xtreg yvar treatment y2000 y2000_treatment [set of controls], fe

Thanks again,

Iván Higuera
Clyde Schechter

Join Date: Apr 2014

Posts: 30100
#4

24 Oct 2014, 18:56

...but if one of the terms of the interaction is lost (due to collinearity) or is statistically non-significant, does that not make the interaction non-significant as well?

No. Even outside this particular context that's not true. And it's certainly not true in the DinD context. Suppose you had used a random effects model instead of fixed-effects. Stata would not complain about collinearity in that case. You would get separate estimates for treatment, year, and treatment#year. But the naming of those effects is completely misleading. In this context, the coefficient of treatment is not a "treatment effect," it is instead the expected difference of the outcome variable in the two arms (treatment arm - control arm) in the base year only. In fact, if your treatment and control groups were randomly assigned, you would expect it to be zero, and you would have some concerns about the adequacy of your randomization if it were not close to zero. In an observational study, of course, one has less confidence that the expected difference in the two groups prior to exposure would be zero--but even there, if the two groups really differed greatly at baseline that would be cause to worry about the applicability of a DinD model as well. So, in general, one expects the coefficient of the treatment variable to be close to zero and one prefers it to be not statistically significant when doing this kind of model.

In the fixed effects model, as previously noted, this all gets washed out anyway because the "treatment" effect is completely confounded with the fixed effect and gets dropped from the model; only the effects of things that vary within subjects are estimated. This is absolutely no cause for concern whatsoever.
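If it helps to see this in action, here is a small simulation sketch. Everything in it is a made-up assumption for illustration (200 hypothetical units, a true interaction effect of 2, no baseline difference between arms); it is not your data or your model.

```stata
* Hypothetical simulated panel; all names and values are illustrative assumptions
clear
set seed 12345
set obs 200
gen id = _n
gen treatment = (id <= 100)        // time-invariant arm indicator
gen alpha = rnormal()              // unit-level heterogeneity
expand 2                           // two observations per unit
bysort id: gen year = cond(_n == 1, 1997, 2000)
* Assumed data-generating process: the true DiD effect is 2
gen yvar = alpha + 0.5*(year == 2000) + 2*treatment*(year == 2000) + rnormal()

xtset id year
* Random effects: treatment, year, and the interaction are all estimated
xtreg yvar i.treatment##i.year, re
* Fixed effects: 1.treatment is omitted, the interaction is still estimated
xtreg yvar i.treatment##i.year, fe
```

In both outputs, the row to look at is 1.treatment#2000.year, which should land in the neighborhood of the assumed value of 2. In the random-effects output, the coefficient on 1.treatment should be close to zero, because the simulated arms do not differ in the base year.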
Iván Higuera Mendieta

Join Date: Oct 2014

Posts: 28
#5

25 Oct 2014, 09:05

Thank you very much for your explanation

Iván Higuera