Difference in Differences model with more than two periods of treatment in panel data

Vaggelis Ktenas

Join Date: Nov 2018

Posts: 16
#1

04 Nov 2018, 13:11

Dear Statalists,

I have a problem with DID regression with more than two periods of treatment in panel data. First, I think I don't understand the purpose of this method: if we want to check the heterogeneity across years, we just use an FE model and control for the treatment dummy; if we want to see the differences in the target variable across and between companies, we use an RE model with the treatment variable as an independent variable. With multiple periods of treatment, why should we use the DID approach? Moreover, how should I create the time-of-treatment dummy when I generated treatment=1 for any company that "had a treatment" for at least one year? Within each company, the variable treatment is 1 in all years (or 0 in all years), whereas the treatment started at different times for different companies.

For example:
y = a + b1*time + b2*treat + d*time*treat + e
For the first company I have four years of observations, treat=1 for all years, and the treatment started in the 4th year. Is time then the vector (0,0,0,1)?

Furthermore, should I also add year dummies to the model?

Thank you very much
Vaggelis Ktenas

Join Date: Nov 2018

Posts: 16
#2

05 Nov 2018, 08:07

I made a mistake. I now understand that the treatment variable determines the time variable. The treatment variable in the question above was not designed correctly. My actual difficulty is that I have unbalanced data (at most 5 years of observations per company), and

for company A: treat=(1,1,0,0,0) by year, for company B: treat=(0,0,1,1), for company C: treat=(1,1,1,1,0)
so treatment happens in different periods for each company. In my example, is time=(1,1,1,1,0)?
Clyde Schechter

Join Date: Apr 2014

Posts: 30118
#3

05 Nov 2018, 08:40

Your second post makes things a lot clearer. This type of data is not suitable for the classical difference-in-differences analysis. It should be analyzed using generalized difference-in-differences. You can find an overview of the technique at https://www.ipr.northwestern.edu/wor.../Day%204.2.pdf.

You will not have a pre-post variable in the model. You need a variable that takes on the value 1 in any observation whose company is receiving the treatment in that year, and 0 otherwise. If I understand what you say in #2 correctly, the variable you call treat is precisely this variable. What is crucial is that you must also include both company and year fixed-effects. So your code will look something like this:

Code:

xtset company year
xtreg outcome i.treat i.year, fe

You may want to also include some covariates in your regression, and you might want to consider clustering the standard errors on company. Those are separate decisions that I can't comment on with the information provided so far.

The coefficient you get for 1.treat after running the above is the generalized DID estimator of the treatment effect.
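As a rough, self-contained illustration of the setup described above, the following sketch simulates a staggered-treatment panel and fits the same model. Everything in it is made up for illustration: the number of companies, the start years, the true effect of 2, and the variables start and u are all assumptions, and clustering on company is shown only as one possible choice.

```stata
* Simulated example only: 200 hypothetical companies observed for 5 years
clear
set seed 12345
set obs 200
generate company = _n
generate u = rnormal()                     // company-level effect, constant over time
generate start = 2 + floor(4*runiform())   // first treated year, between 2 and 5
replace start = . if runiform() < 0.3      // roughly 30% of companies never treated
expand 5
bysort company: generate year = _n
* Missing start counts as larger than any year, so never-treated firms get treat = 0
generate byte treat = (year >= start)
generate outcome = 1 + 2*treat + 0.5*year + u + rnormal()

* Generalized DID: company fixed effects (via fe) plus year indicators
xtset company year
xtreg outcome i.treat i.year, fe vce(cluster company)
```

In this simulation the coefficient on 1.treat should come out close to the assumed true effect of 2.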
Vaggelis Ktenas

Join Date: Nov 2018

Posts: 16
#4

05 Nov 2018, 09:05

Dear Clyde,

Thank you for the answer.

I do want to add covariates to my regression. I also have a lot of companies/panels (2110). I read in the linked document that if there are many clusters I should use cluster(id); is that right? What do you mean by "those are separate decisions"? That adding covariates is independent of clustering? Last but not least, the generalized DID model you proposed looks the same as an FE model with both year and company effects, a treatment dummy as an independent variable, and other continuous covariates. So I can see the differences in the outcome across years and within companies, and I also control for years. Is there any difference between generalized DID and an FE model with a treatment dummy?
Clyde Schechter

Join Date: Apr 2014

Posts: 30118
#5

05 Nov 2018, 09:11

I do want to add covariates to my regression. I also have a lot of companies/panels (2110). I read in the linked document that if there are many clusters I should use cluster(id); is that right? What do you mean by "those are separate decisions"?

They are separate decisions in the sense that they depend on what the variables in your study are and mean, and how many clusters you have. The decision to include panel and time fixed effects does not depend on those things and must be done regardless. Also the decision to add covariates depends on whether or not there is reason to be concerned about measurable variables that are associated with the outcome and are not constant within panel, whereas the decision to cluster the standard errors is mostly dependent on whether you have enough clusters to do that. So these two decisions are separate from each other and also separate from the overall design of the generalized DID model.

Is there any difference between generalized DID and an FE model with a treatment dummy?

Yes. An FE model with a dummy of treatment would ordinarily not include year indicators. The Generalized DID model must include the year indicators or it will not be valid.
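To make the contrast concrete, here is a minimal sketch that fits both specifications and puts the treatment coefficients side by side. It assumes the panel is already in memory with the variable names used earlier in this thread (company, year, outcome, treat); the stored-estimate names fe_only and gdid are arbitrary labels.

```stata
xtset company year

* Company fixed effects only: common year shocks are not controlled for
xtreg outcome i.treat, fe
estimates store fe_only

* Generalized DID: company fixed effects plus year indicators
xtreg outcome i.treat i.year, fe
estimates store gdid

* Compare the coefficient on 1.treat across the two models
estimates table fe_only gdid, keep(1.treat) b se
```

If the outcome trends over time, the two coefficients can differ noticeably, because the first model attributes part of the common time trend to the treatment.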
Vaggelis Ktenas

Join Date: Nov 2018

Posts: 16
#6

05 Nov 2018, 09:35

I see. Thank you very much, dear Clyde, for your clear answers!
Vaggelis Ktenas

Join Date: Nov 2018

Posts: 16
#7

05 Nov 2018, 11:52

One more question, Mr. Clyde,

What about interactions? Can I include interactions between i.treat and continuous variables (i.e., c.variable##i.treat) in the generalized DID model, to see how the treatment effect varies with another factor? Would the generalized DID estimator then still be the coefficient of i.treat?
Clyde Schechter

Join Date: Apr 2014

Posts: 30118
#8

05 Nov 2018, 14:55

You can include interactions between treat and other variables (continuous or discrete), but when you do that, there is no longer any such thing as "the generalized DID estimator." By using an interaction model, you are making the claim that there is no overall treatment effect, rather that the treatment effect depends on the values of the variables you are interacting with. So there are many different treatment effects, infinitely many if the variable you are interacting with is continuous. The coefficient of the treatment variable, in this setting, changes its meaning: it now is the treatment effect when all of the variables it is interacted with are zero. Since zero may or may not even be a possible value for some of these other variables, the coefficient of the treatment variable may actually be of no relevance at all. If the other variables can all be 0, then it might be of some interest, but it is usually not going to be particularly important compared to the treatment effect at other values of the continuous interacting variable(s) that might be more frequently occurring or more "important" in some other sense.

When you do that, you have to then report treatment effects corresponding to several "interesting" values of the continuous variables you are interacting with. The best way to do this in Stata is with the -margins- command, and it is usually best to display the results graphically with the -marginsplot- command. I highly recommend you look at https://www3.nd.edu/~rwilliam/stats/Margins01.pdf, by the excellent Richard Williams. It's a very clearly written introduction to the use of the -margins- command and it includes several worked examples, including interaction models, so you will get a sense of how to do it.
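A minimal sketch of that workflow follows. It assumes the variable names used earlier in this thread, plus a hypothetical continuous moderator called x; the values in at() are placeholders and should be replaced with values that are meaningful for your own variable.

```stata
xtset company year

* Generalized DID with the treatment effect allowed to vary with x
xtreg outcome i.treat##c.x i.year, fe vce(cluster company)

* Treatment effect (treat = 1 vs. treat = 0) at selected values of x
margins, dydx(treat) at(x = (10(10)50))

* Plot those treatment effects with their confidence intervals
marginsplot
```

Each row of the -margins- output is the estimated treatment effect at one value of x, which is the quantity to report instead of the bare coefficient on 1.treat.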
Vaggelis Ktenas

Join Date: Nov 2018

Posts: 16
#9

05 Nov 2018, 15:13

Most variables can't be 0, but there are some that can (at least one I have in mind). So I should interact the treat variable with those. Would it be better to examine the interactions with the variables that can be 0 in another model, such as an RE, FE, or just a pooled model including the treatment variable?
Clyde Schechter

Join Date: Apr 2014

Posts: 30118
#10

05 Nov 2018, 16:27

You can still interact the treatment variable with continuous variables that can never be zero in the same model. It just means that the coefficient of the treatment variable becomes meaningless and you must calculate the treatment effects at specific, meaningful values of the other variables. The -margins- command makes that pretty easy.
Vaggelis Ktenas

Join Date: Nov 2018

Posts: 16
#11

07 Nov 2018, 15:12

I have a question, again. If there is heterogeneity among groups and the appropriate model is RE, is it wrong to use the DID model? I understand that in the DID method there are other factors (besides treatment) that affect the dependent variable, but taking the difference between the two groups removes them. Furthermore, the whole idea of this method is to focus on the treatment and on specific years, so we need a model that controls for the group effects. My question is: what if my appropriate baseline model is RE? Should I apply the DID and generalized DID methods anyway?

Thank you
Clyde Schechter

Join Date: Apr 2014

Posts: 30118
#12

07 Nov 2018, 15:26

It is certainly permissible to analyze DID models with random effects instead of fixed effects if the usual assumptions underlying re are met. I can't see any reason why the same wouldn't be true of generalized DID, though I also have never actually seen it done.
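For what it is worth, a random-effects version of the generalized DID model might be sketched as below, again assuming the variable names used earlier in this thread. The Hausman comparison is included only as one common way of probing whether the RE assumptions are plausible; it is not something this thread establishes as required, and it is run here on models without clustered standard errors because -hausman- does not support them.

```stata
xtset company year

* Fixed-effects generalized DID
xtreg outcome i.treat i.year, fe
estimates store fe

* Random-effects counterpart with the same regressors
xtreg outcome i.treat i.year, re
estimates store re

* Compare the two sets of coefficients
hausman fe re
```

A large test statistic would suggest that the RE assumptions are doubtful for these data, in which case the fixed-effects version is the safer choice.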
Vaggelis Ktenas

Join Date: Nov 2018

Posts: 16
#13

07 Nov 2018, 15:41

Me neither (in the 1/100 I've read compared to you), which is why it struck me as wrong. So I can use RE without worrying.

Thank you, all your answers helped me very much.
Michael Loden

Join Date: Feb 2020

Posts: 2
#14

27 Feb 2020, 08:27

Clyde Schechter

Hi Clyde,

I would like to learn more about the generalized diff in diff approach and I have seen that you have recommended the following link on several different threads:
https://www.ipr.northwestern.edu/wor.../Day%204.2.pdf

I opened the link but was not able to access the paper/report.
Do you by any chance know what the name of the paper/report is so I can look it up?

Thank you in advance
Clyde Schechter

Join Date: Apr 2014

Posts: 30118
#15

27 Feb 2020, 11:09

Yes, that link has been broken for some time and the paper seems to have disappeared from the internet. In my more recent posts I have been referring people instead to:

https://www.annualreviews.org/doi/pd...-040617-013507

FWIW, I have also gotten more positive feedback about this new reference than the old one.