  • Generalized difference in differences problem

    I am evaluating a government program using difference in differences. The program was implemented in 2014, and I have administrative records for the outcome variable from 2010 to 2018 (except for 2014). First, I evaluated the program using only two years: 2013 (before the program) and 2015 (after). My regression looks like this:

    Code:
    regress score i.treatment##i.post X
    where post indicates whether the observation is from 2015 and X is my vector of controls. Now I want to include all years, so first I created a dummy for the post-program period like this:

    Code:
    gen byte post = (dummy2015==1 | dummy2016==1 | dummy2017==1 | dummy2018==1)
    I was about to run a regression like the one above, but including interactions of treatment with each year dummy, of post with each year dummy, and of treatment, post, and each year dummy. However, I noticed that, for example, the interaction between post and the year dummy does not make sense for the pre-treatment years, since an observation from 2010, 2011, 2012, or 2013 will never have post equal to one (by definition), and there might be other collinearity problems with the rest of the variables, or not?
    How should I be running this regression? I think what I am asking for is the specification of difference in differences with multiple periods.

    Second, I read in Mostly Harmless Econometrics that a good way to test the identification is a Granger-type test. The book says the test consists of making sure that leads do not matter in an equation that contains interactions of treatment with dummies for the years before the program, as well as interactions of treatment, the post-program year dummies, and the control variables (triple interactions). What does this mean?

    Finally, is that really a good way of testing the identification? If not, could you tell me how to test my identification strategy, or point me to a reference where I can find out?

  • #2
    The interaction between the "Post" variable and the post-treatment year dummies would not make sense in the regression you specified. Instead, you should drop the "Post" variable entirely and interact the treatment variable with your newly created year dummies. This will show you how your outcome changes after exposure to the program. In other words, you are no longer assuming a constant treatment effect once the program takes effect.
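
    For illustration, here is a minimal sketch of that specification. It assumes a numeric year variable (call it year) and a panel identifier (call it id), neither of which appears in your post, and uses 2013 (the last pre-program year) as the base year:

    Code:
    * sketch only: "year" and "id" are assumed variable names not shown in the thread
    * interact treatment with the year dummies, using 2013 as the base year
    regress score i.treatment##ib2013.year X, vce(cluster id)
    The coefficients on 1.treatment#year for 2015 through 2018 trace out how the effect evolves after the program, while the pre-program interactions should be close to zero if the parallel-trends assumption is plausible.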

    Including all years (i.e., year dummies for 2010 through 2018) in a regression framework is equivalent to incorporating "time" fixed effects and would introduce collinearity problems should you also include the "Post" variable in your model; Stata will drop one of the collinear terms so that the model can be estimated. The "Post" variable should index all years after the program commences (i.e., 2015 onward). Is the program only in effect for one year? Also, does it turn off and back on again? If the program continues without interruption, then you can proceed with the classical approach.
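
    Concretely, a minimal sketch of the classical set-up with all years (again assuming the year and id variables above): the main effect of post is dropped because it is collinear with the year dummies, and only the treatment-by-post interaction enters.

    Code:
    * classical DiD with year fixed effects (sketch; "year" and "id" are assumed names)
    gen byte did = treatment*post      // equals 1 only for treated units in 2015-2018
    regress score i.treatment i.year did X, vce(cluster id)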

    It should be noted that a single post-treatment dummy will work in your setting only if all treated units received the program/treatment at the same time. If different units received the treatment at different times, then you can't use the classical approach indicated in your code. Instead, you need to adopt the more "generalized" DD approach.

    Clyde also addresses collinearity problems of this sort very well in another post (see below):

    https://www.statalist.org/forums/for...ferences-model

    The more general DD approach incorporates unit fixed effects, time fixed effects, and a dummy indexing treated units in post-treatment years. You need this setup to capture any dynamic effects of the treatment. That being said, when you incorporate leads in your model, you ultimately don't want to see program effects before the program actually begins.
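
    As a rough sketch of that more general set-up (again with the assumed year and id variables; the post indicator can vary by unit when treatment timing is staggered):

    Code:
    * generalized DiD sketch: unit FE, year FE, and a dummy for treated units in post years
    xtset id year
    gen byte treat_post = treatment*post
    xtreg score treat_post i.year X, fe vce(cluster id)
    * dynamic version: leads and lags of treatment relative to the 2013 base year
    xtreg score i.treatment#ib2013.year i.year X, fe vce(cluster id)
    * the leads (pre-program interactions) should be jointly insignificant
    test 1.treatment#2010.year 1.treatment#2011.year 1.treatment#2012.year
    A joint test on the leads that does not reject is at least consistent with the parallel-trends assumption behind the design.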

    Hope this helps!
