Difference in difference regression

Ludmila Farooq

Join Date: Mar 2017

Posts: 97
#1

30 Aug 2017, 01:53

Does anybody know what (panel data) model is used here?
It does not look like a normal DiD regression to me. Can anybody tell me what kind of DiD regression they have used here?

y_{m,r,t} = α + γ D_{m,t} + β X_{m,t} + δ_r + σ_t + ε_{m,r,t}
where y_{m,r,t} is the outcome of interest for municipality m in region r at time t, and α is a constant.
D_{m,t} is an indicator variable equal to zero for a municipality before it receives a hydroelectric
power plant, and equal to one after one is built. X_{m,t} is a set of municipality-specific
characteristics that are time-variant, including doctors per capita, the number of people in
poverty, and taxable income per taxpayer. δ_r is a set of indicator variables at the regional level,
allowing for differences in levels between the five regions.

Thanks in advance
Abdul Adam

Join Date: Dec 2016

Posts: 76
#2

30 Aug 2017, 03:06

Ludmila:
I think you might get clarity on this by reading
Wooldridge, J. (2013). Introductory Econometrics: A Modern Approach. 5th Edition. South-Western. See particularly Chapter 13.2, Policy Analysis with Pooled Cross Sections, or the paper cited in that section:

Kiel, K. A., and K. T. McClain (1995), “House Prices during Siting Decision Stages: The Case of an Incinerator from Rumor through Operation,” Journal of Environmental Economics and Management 28, 241–255.
Ludmila Farooq

Join Date: Mar 2017

Posts: 97
#3

30 Aug 2017, 03:50

Thanks Abdul,
I have read Wooldridge, J. (2013), Introductory Econometrics.
I don't think it is about pooled cross sections; the example above uses panel data. What I don't understand is the dummy variable "D": why is only one dummy used in a DiD regression?
Joshua D Merfeld

Join Date: Jun 2015

Posts: 86
#4

30 Aug 2017, 05:12

Ludmila, what do the data look like? Do you have the same municipalities over time? If so, we could certainly call it panel data. I admit I am still not completely clear what you are asking, though.
Clyde Schechter

Join Date: Apr 2014

Posts: 30117
#5

30 Aug 2017, 09:46

This does not look like a difference-in-differences model to me. The variable D appears to be simply a pre-post indicator variable. There does not appear to be a control group. Rather, this looks like a simple pre-post comparison in municipalities that received a hydroelectric plant.
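
For contrast, here is a minimal sketch of the textbook two-group DiD specification. The indicators Treat_m (municipality ever receives a plant) and Post_t (period after construction) are hypothetical names introduced for illustration; they do not appear in the quoted equation:

```latex
y_{m,t} = \alpha + \beta_1\,\mathrm{Treat}_m + \beta_2\,\mathrm{Post}_t
        + \gamma\,(\mathrm{Treat}_m \times \mathrm{Post}_t) + \varepsilon_{m,t}
```

In this form, γ on the interaction is the DiD estimate: the pre-post change in the treated group minus the pre-post change in the comparison group. The quoted equation, as written, contains only the single indicator D_{m,t}.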

Ludmila Farooq

Join Date: Mar 2017

Posts: 97
#6

06 Sep 2017, 09:43

Thanks, Clyde, for always answering my (not so clear) questions.

I have another less complicated question.
In the dataex below you see variables:

ARSTALL = Year (time variable)
depid= Department ID
facid= Faculty ID
public= Number of publications per year in departments

1. How can I create control and treatment groups? (I tried to generate dummy, but for some reason it didn't work)
The control group has to be all departments with facid equal to 3 or 4; the remaining departments form the treatment group. The outcome (dependent) variable is "public".

2. How can I control for different trends across departments?

Thanks in advance

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input int ARSTALL float(depid facid public)
2005  1 13   6
2006  1 13   5
2008  1  3   1
2010  1 10   2
2011  1  8   7
2012  1  1   6
2013  1 12   3
2014  1  1   6
2015  1  4   6
2016  1  4   1
2005  2 14   2
2006  2 14   2
2007  2 14   6
2008  2 14   2
2011  2  3   1
2012  2  3   1
2014  2  3   1
2016  2  3   1
2008  3 14   2
2006  4 14  17
2007  4 14  15
2008  4 14  11
2010  4 14   7
2006  5  5   1
2014  5 11   1
2015  5 13   1
2005  6  3  28
2006  6  3  32
2007  6  3  38
2008  6  6  28
2009  6  3  34
2010  6  3  13
2011  6  3  25
2012  6  3  42
2013  6  3  40
2014  6  3  42
2015  6  3  32
2016  6  3  48
2005  7  3  31
2006  7  3  32
2007  7  6  30
2008  7  3  29
2009  7  3  39
2010  7  3  42
2011  7  3  52
2012  7  3  58
2013  7  3  48
2014  7  3  61
2015  7  3  33
2016  7  3  77
2005  8  6   2
2006  8  6   1
2007  8  6   5
2008  8  6   4
2012  8  6   4
2013  8  6   3
2014  8  6   6
2015  8  6   3
2016  8  6   6
2005  9  4  48
2006  9  4 134
2007  9  4  74
2008  9  4 103
2009  9  4 102
2010  9  4 102
2011  9  4  96
2012  9  4 100
2013  9  4 107
2014  9  4  75
2015  9  4 120
2016  9  4 101
2005 10  4  49
2006 10  4  71
2007 10  4  35
2008 10  4  61
2009 10  4  65
2010 10  4  68
2011 10  6  79
2012 10  4  72
2013 10  6  15
2014 10  6  14
2015 10  6  13
2016 10  6  14
2005 11  4  23
2006 11  4  34
2007 11  4  32
2008 11  4  34
2009 11  4  51
2010 11  4  47
2011 11  4  93
2012 11  4  49
2015 11  4   3
2005 12  4  74
2006 12 14 110
2007 12  4 150
2008 12  4 160
2009 12  4 187
2010 12  4 184
2011 12  4 135
2012 12  4 142
end


Ludmila Farooq

Join Date: Mar 2017

Posts: 97
#7

06 Sep 2017, 09:44

Note: the pre-treatment period is 2005 and the treatment happens in 2006.
Clyde Schechter

Join Date: Apr 2014

Posts: 30117
#8

06 Sep 2017, 09:48

1. How can I create control and treatment groups? (I tried to generate dummy, but for some reason it didn't work)
The control group has to be all departments with facid equal to 3 or 4; the remaining departments form the treatment group. The outcome (dependent) variable is "public".

I'm afraid "I tried to generate dummy, but for some reason it didn't work" isn't helpful: you don't show the code you tried, nor do you show what you got as a result, nor why it wasn't what you wanted. So I can't really help you fix what you did. In this case, however, it is easy enough to just give you code that does this:

Code:

gen byte arm = !inlist(facid, 3, 4) if !missing(facid)
label define arm 0 "Control" 1 "Treatment"
label values arm arm
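
To check the result, here is a small sketch that uses five rows copied from the -dataex- extract above. The cross-tabulation should show facid 3 and 4 under Control and every other facid under Treatment; the -assert- lines are an optional safeguard added here, not part of the original answer:

```stata
* Five rows copied from the -dataex- listing earlier in the thread
clear
input int ARSTALL float(depid facid public)
2005 1 13  6
2008 1  3  1
2015 1  4  6
2008 6  6 28
2005 9  4 48
end

* arm = 0 for facid 3 or 4 (control), 1 otherwise, missing if facid is missing
gen byte arm = !inlist(facid, 3, 4) if !missing(facid)
label define arm 0 "Control" 1 "Treatment"
label values arm arm

* Cross-tabulate to inspect the coding
tabulate facid arm, missing

* Stop with an error if the coding is not as intended
assert arm == 0 if inlist(facid, 3, 4)
assert arm == 1 if !inlist(facid, 3, 4) & !missing(facid)
```

Note that in the posted extract facid changes within some departments over time (depid 1, for example), so arm defined this way can vary within a department.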

2. How can I control for different trends across departments?

This question is much too broad to answer. Please ask a more specific question.
Philip Gigliotti

Join Date: Nov 2016

Posts: 118
#9

06 Sep 2017, 09:54

Difference-in-differences can be generalized to a two-way fixed-effects model by dropping the treatment-group and post-treatment indicators and adding group and year fixed effects.

Traditional diff-in-diff is:

Code:

* ## expands to both main effects plus their interaction
reg y i.treatment##i.post

This can be generalized to two-way fixed effects by:

Code:

* assumes the data have been -xtset- on the panel identifier
xtreg y 1.treatment#1.post i.year, fe

The group fixed effect soaks up all time-invariant characteristics of the group, including treatment status. The year effects soak up all year-specific characteristics, including post-treatment status.

Econometrics is moving towards the generalized two-way fixed-effects diff-in-diff, as it is more robust. Using year effects instead of a post-treatment dummy is always necessary with more than two periods, or the result will be biased (see Wooldridge). However, more people are generalizing the treatment fixed effect to a group fixed effect. It's probably best to provide both models.

One would be

Code:

reg y i.treatment 1.treatment#1.post i.year

and the other would be

Code:

xtreg y 1.treatment#1.post i.year, fe
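
A minimal sketch of these specifications on simulated data, written with Stata's factor-variable operators (# and ##) for the interaction. The data-generating process, the names depid and year, the treatment date, and the true effect of 3 are all assumptions made up for illustration; this is not the thread's data:

```stata
* Simulated balanced panel: 200 hypothetical departments, 2005-2016
clear
set seed 2017
set obs 200
gen int depid = _n
gen byte treatment = depid > 100          // assumed: half the departments are treated
expand 12                                 // 12 yearly observations per department
bysort depid: gen int year = 2004 + _n
gen byte post = year >= 2006              // assumed treatment date
gen y = 10 + 2*treatment + 0.5*(year - 2005) + 3*treatment*post + rnormal()

* (1) Classic DiD: treatment, post, and their interaction
regress y i.treatment##i.post, vce(cluster depid)

* (2) Year effects replace the single post dummy
regress y i.treatment 1.treatment#1.post i.year, vce(cluster depid)

* (3) Two-way fixed effects: department effects absorb treatment status
xtset depid year
xtreg y 1.treatment#1.post i.year, fe vce(cluster depid)
```

In each model the coefficient on 1.treatment#1.post is the DiD estimate. Clustering the standard errors by department is a choice made in this sketch, not something prescribed above.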
Ludmila Farooq

Join Date: Mar 2017

Posts: 97
#10

06 Sep 2017, 09:56

I was referring to the "common trends assumption" in DiD.
How can I relax the common trends assumption when the control and treatment groups don't have parallel trends over time?
Ludmila Farooq

Join Date: Mar 2017

Posts: 97
#11

06 Sep 2017, 10:02

Thank you very much Philip
Philip Gigliotti

Join Date: Nov 2016

Posts: 118
#12

06 Sep 2017, 10:04

Originally posted by Ludmila Farooq

I was referring to the "common trends assumption" in DiD.
How can I relax the common trends assumption when the control and treatment groups don't have parallel trends over time?

The parallel trends assumption can be relaxed by adding group specific polynomial time trends to the model.

The parallel trends assumption requires that trends have the same slope, or first derivative. Adding a group-specific linear time trend changes the assumption so that trends may now differ in the first derivative but must be similar in the second derivative. This is the same as saying that trends can diverge, provided they are linear. They cannot differ in the quadratic dimension, an assumption that would likely hold provided there was no shock during the pretreatment period that affected only the treatment groups.

You can relax the assumption further by adding both group specific linear trends and group specific quadratic trends. Now trends can differ in the first and second derivative but not the third.
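
As a sketch of what this means in equation form, assume unit fixed effects α_i, year effects λ_t, a treatment-group indicator Treat_i, a post-treatment indicator Post_t, and a time index t = 1, 2, ... (all of this notation is introduced here for illustration):

```latex
\begin{aligned}
\text{linear trends:}\quad
  y_{it} &= \alpha_i + \lambda_t + \gamma\,(\mathrm{Treat}_i \times \mathrm{Post}_t)
          + \theta_1\,(\mathrm{Treat}_i \times t) + \varepsilon_{it} \\
\text{quadratic trends:}\quad
  y_{it} &= \alpha_i + \lambda_t + \gamma\,(\mathrm{Treat}_i \times \mathrm{Post}_t)
          + \theta_1\,(\mathrm{Treat}_i \times t) + \theta_2\,(\mathrm{Treat}_i \times t^2) + \varepsilon_{it}
\end{aligned}
```

Here θ_1 lets the treated group's slope differ from the control group's, and θ_2 lets its curvature differ as well; γ remains the treatment effect of interest.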

See Mora and Reggio (2017), flexible diff-in-diff with alternative parallel assumptions (working paper available online).
Ludmila Farooq

Join Date: Mar 2017

Posts: 97
#13

06 Sep 2017, 11:41

Thank you, Philip.
Could you please give me the code for "group-specific polynomial time trends"?
(I couldn't find the paper, by the way.)
Philip Gigliotti

Join Date: Nov 2016

Posts: 118
#14

06 Sep 2017, 11:48

The working paper is Mora and Reggio (2012), "Treatment effect identification using alternative parallel assumptions".

To implement polynomial time trends, you need to create a time trend variable that is coded 1 for the first year in your panel, 2 for the second year, ..., n for the nth year. You then add both fixed effects and an interaction of that variable with either your panel ID dummies or your treatment group dummy. The second approach uses fewer degrees of freedom and may more easily yield a significant coefficient.

For higher-degree polynomial trends, square your time trend in a new variable and add the squared variable, interacted with your panel or group dummies, to the model with linear trends. It's important to include fixed effects and all lower-degree terms in these models as you add higher-degree trends.
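
The steps above can be sketched in Stata roughly as follows. The data are simulated, and the names depid, year, t, the treatment date, and the true effect of 3 are assumptions for illustration only. Factor-variable notation (c.t#c.t) is used in place of a separately generated squared variable:

```stata
* Simulated panel with a steeper linear trend in the treated group
clear
set seed 2017
set obs 100
gen int depid = _n
gen byte treatment = depid > 50
expand 12
bysort depid: gen int year = 2004 + _n
gen byte post = year >= 2009              // assumed: several pre-treatment years
gen int t = year - 2004                   // 1 for the first year, 2 for the second, ...
gen y = 10 + 0.5*t + 0.3*treatment*t + 3*treatment*post + rnormal()

xtset depid year

* Linear trend specific to the treatment group
xtreg y 1.treatment#1.post i.year 1.treatment#c.t, fe vce(cluster depid)

* Add a treatment-group quadratic trend, keeping the linear term
xtreg y 1.treatment#1.post i.year 1.treatment#c.t 1.treatment#c.t#c.t, ///
    fe vce(cluster depid)

* Alternative: a separate linear trend for every panel unit (many more parameters;
* Stata drops one trend as collinear with the year effects)
xtreg y 1.treatment#1.post i.year i.depid#c.t, fe vce(cluster depid)
```

With only one pre-treatment year, as in the data posted earlier, a group-specific trend is identified largely from the post-treatment periods, so this approach is most informative when several pre-treatment years are available.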
Clyde Schechter

Join Date: Apr 2014

Posts: 30117
#15

06 Sep 2017, 12:08

Philip Gigliotti I notice an acceleration in your postings here on the Forum. Thank you for your contributions. One suggestion (not originally mine, it's in the FAQ): when giving a reference give the complete reference information (or provide a link if it's an online source). Maybe everybody in econometrics knows what mora and reggio 2012 (or is it 2017, or are those two different papers by the same authors) is, and probably Ludmila Farooq will, too, as it seems like she is in your field. But these posts are read by other people who may be interested in learning the methodology. So as a courtesy to them and to make the Forum maximally helpful to those who read it but don't post, it is best to show complete references.

One suggestion regarding polynomial trends of higher than quadratic degree. If your T is at all large, when you get to T³ and higher powers, you will be generating variables with some very large numerical values. Combined with other variables in the same data set that have smaller values, these can lead to numerical difficulties when Stata tries to estimate the models and may, ultimately, cause convergence failure. So when using variables like that, it is a good idea either to re-scale them or to use T centered at some reasonable value (mean, median, etc.) and its powers for the time trends. The centering approach has the additional slight advantage that it reduces collinearity among the time trend powers, so that if estimation of those effects is directly of interest, you can do that with greater precision. (Admittedly, usually these trends are incorporated for adjustment purposes only, but sometimes they are of interest.)
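
A minimal sketch of the centering idea, using a hypothetical 12-year trend variable t (the names t, tc, t2 and tc2 are made up for illustration):

```stata
* Twelve hypothetical yearly observations
clear
set obs 12
gen t = _n                        // raw trend: 1, 2, ..., 12

* Center the trend at its mean
summarize t, meanonly
gen tc = t - r(mean)

* Squares of the raw and the centered trend
gen t2  = t^2
gen tc2 = tc^2

* The raw trend and its square are strongly correlated ...
correlate t t2

* ... while the centered trend and its square are not
correlate tc tc2
```

In a model, the centered variable would then take the place of the raw trend, for example 1.treatment#c.tc and 1.treatment#c.tc#c.tc, where treatment is again a hypothetical group indicator.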
