Fixed effect difference-in-differences model

dupont john

Join Date: Dec 2015

Posts: 49
#16

19 Mar 2016, 17:59

Hi Clyde,

Another issue I am worried about is whether I should have the same number of years before and after my "Treatment" takes place. For example, I saw that papers usually use 10 years before and 10 years after. Is this an obligation when we use a difference-in-differences model?

Thanks!!

Best,

JD
Comment
Clyde Schechter

Join Date: Apr 2014

Posts: 30147
#17

19 Mar 2016, 18:12

No, it is not an obligation. You need some pre-treatment-era and some post-treatment-era data in both groups. To the extent you have more of it (but not extending over time periods so long that relevant conditions not accounted for in your model change) you will get more precise estimates of the effect of treatment.

Assuming that outcome variation is homogeneous over time, the most efficient design for the same total amount of data would be to have equal durations before and after the change-point. That may well be why people commonly choose to use equal durations of observation before and after. But you can perfectly well do it with more on one side of the change-point than the other. After all, maximizing efficiency for the same total amount of data is only optimal if data from all time periods is equally easy to get.
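
The efficiency point can be illustrated with a simplified calculation. This is only a sketch under assumptions not stated in the post: independent period means with a common variance \sigma^2, and a plain pre/post difference in means within one group (the symbols T_pre, T_post and T are introduced here for illustration).

```latex
\[
\operatorname{Var}\!\left(\bar{y}_{\text{post}} - \bar{y}_{\text{pre}}\right)
  = \sigma^2\left(\frac{1}{T_{\text{pre}}} + \frac{1}{T_{\text{post}}}\right),
\qquad T_{\text{pre}} + T_{\text{post}} = T .
\]
\[
\frac{1}{T_{\text{pre}}} + \frac{1}{T - T_{\text{pre}}}
  = \frac{T}{T_{\text{pre}}\,(T - T_{\text{pre}})}
\quad\text{is smallest when } T_{\text{pre}} = T_{\text{post}} = \tfrac{T}{2}.
\]
```

For example, with T = 20 periods, a 10/10 split gives 0.2\sigma^2, while a 5/15 split gives about 0.267\sigma^2: less efficient, but still perfectly usable.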
Comment
dupont john

Join Date: Dec 2015

Posts: 49
#18

19 Mar 2016, 21:46

Thanks Clyde!

Another thing I was worried about: if I have GDP growth in my regression, which is already a percentage, should I still take the log of this variable?

Thanks a lot!

JD
Comment
Clyde Schechter

Join Date: Apr 2014

Posts: 30147
#19

20 Mar 2016, 11:34

That question is beyond the scope of my knowledge and expertise. It depends on whether the rate of growth in GDP is related to your outcome variable in an additive or multiplicative way. That's not a statistics question, it's a question in your discipline. If I were you I'd ask a colleague in the discipline about that.
Comment
Henrik Dalriksson

Join Date: Apr 2017

Posts: 29
#20

29 May 2017, 13:49

Clyde Schechter

Are there any situations where you would like to keep firm fixed effects and still include TREAT?

Are there any arguments for why you would like to keep TREAT in this particular regression when you still have firm fixed effects?

Last edited by Henrik Dalriksson; 29 May 2017, 13:54.
Comment
Clyde Schechter

Join Date: Apr 2014

Posts: 30147
#21

29 May 2017, 17:32

Are there any situations where you would like to keep firm fixed effects and still include TREAT?

If we are still talking about the situation presented at the beginning of this thread, it is not a question of whether one would like to keep firm fixed effects and still include TREAT. It is simply not possible.
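
A minimal sketch of why this is not possible, using simulated data rather than the original poster's dataset (the firm count, years, and coefficients below are all made up for illustration). Because TREAT never changes within a firm, the fixed-effects (within) transformation removes it, and Stata reports it as omitted while still estimating the interaction.

```stata
* Simulated panel: 20 hypothetical firms observed over 10 years
clear
set seed 12345
set obs 20
gen firmID = _n
gen TREAT = firmID > 10                 // time-invariant group indicator
expand 10
bysort firmID: gen year = 2000 + _n     // years 2001-2010
gen POST = year >= 2006
gen y = 1 + 0.5*TREAT + 0.3*POST + 2*TREAT*POST + rnormal()

xtset firmID year
* 1.TREAT is collinear with the firm fixed effects and is omitted;
* the TREAT#POST interaction (the DID term) is still estimated
xtreg y i.TREAT##i.POST, fe
```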
Comment
Marisa Foraci

Join Date: Mar 2017

Posts: 10
#22

30 May 2017, 04:49

Dear all,

I have a doubt about a regression run on an -xtset- panel dataset. In order to interpret a unit change in my predictors as a percent change in my outcome variable, I am log-transforming the outcome. The manual I am using as a reference for the DID syntax transforms variables into logs using gen newvar = ln(1+x), while I usually transform variables into their logs using gen newvar = ln(x). When comparing the means of the two logged variables I get the following:

Mean estimation                   Number of obs   =       6380

--------------------------------------------------------------
             |       Mean   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
          ln |   .9707089   .0090013      .9530633    .9883545
    ln1plusx |   1.344655   .0060332      1.332828    1.356482
--------------------------------------------------------------

For ln(1+x) the mean is higher, the standard error is lower and the confidence interval is narrower.
Any hint on why one form could be preferred to the other?

Thank you in advance
Comment
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17724
#23

30 May 2017, 05:05

Marisa:
as this thread has little to do with the one started by the original poster, for the future please start a new thread. Thanks.
The habit of adding a "small" constant (unfortunately, an unsatisfactory qualitative statement for shedding light on a quantitative issue) to avoid missing values when the raw value of the variable to be logged is <=0 is difficult to justify.
When I'm forced/urged to log the original variable, I do not add anything and bear the consequences of that choice in terms of a reduced sample size.
Needless to say, any imputation practice is, in this case, out of the question.

Kind regards,
Carlo
(Stata 19.0)
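
A small sketch of the difference between the two transformations discussed above, on made-up values of a hypothetical variable x. It shows that ln(x) is missing when x is 0 (so those observations drop out of any regression), while ln(1+x) keeps them but is a different variable: it is always larger than ln(x) and less spread out, which is consistent with the higher mean and smaller standard error reported above.

```stata
clear
input x
0
0.5
1
10
end

gen ln_x     = ln(x)        // missing when x <= 0
gen ln1plusx = ln(1 + x)    // defined for every x > -1

count if missing(ln_x)      // observations that ln(x) would drop
summarize ln_x ln1plusx
list
```

The two versions therefore answer slightly different questions, and coefficients on ln(1+x) are only approximately percent changes when x is large relative to 1.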
Comment
Pietro Fera

Join Date: Jun 2017

Posts: 3
#24

09 Jun 2017, 06:56

Hello everyone,
I'm a real beginner in the DID field and I'll probably ask "stupid" questions, but I would be grateful if you could help me.

I have a set of many firms' accounting data from 2001 to 2015.
I would like to analyze the impact of IFRS adoption on the level of matching between revenues and expenses. This means that I have a treatment group consisting of those firms that have adopted IFRS, and a control group of firms that do not use IFRS.
So, I have these variables:
MATCHING = dependent variable
IFRS = dummy variable that is 1 for those firms that have adopted IFRS, and 0 otherwise
TIME = dummy variable that is 1 for every year after the IFRS adoption, and 0 for the years before the IFRS adoption

My problem is related to the treatment period (and so to the variable TIME), because I'm dealing with voluntary IFRS adoption and, therefore, firms adopted IFRS in different years starting from 2006. In this case the dummy variable that represents the treatment period can vary within the treatment group (in my case, it will be 1 for each firm i that does IFRS reporting in year t), but what happens to the same time variable for the control group? I don't think it can be right for the time variable to always be zero for the control group. In fact, when I use the Stata commands, they don't work properly. So, should the time variable (for the control group) be 1 from 2006 to 2015 (the period when the treatment starts, even if not for the whole treatment group)?

Or, in this case, is the specification (Y = a0 + a1 TREAT*POST + YearsDummies + FirmsAttributes) no longer correct, so that I should do something different?

Thanks a lot for the answer!
Comment
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17724
#25

09 Jun 2017, 07:03

Pietro:
welcome to the list.
What if you try:

Code:

xtset firmID year
xtreg MATCHING i.IFRS##i.TIME <othercontrols>, fe

Kind regards,
Carlo
(Stata 19.0)
Comment
Pietro Fera

Join Date: Jun 2017

Posts: 3
#26

09 Jun 2017, 07:30

Originally posted by Carlo Lazzaro View Post

Pietro:
welcome to the list.
What if you try:

Code:

xtset firmID year
xtreg MATCHING i.IFRS##i.TIME <othercontrols>, fe

Hi Carlo.

Thank you for the answer.
I see your point, but I have a problem with the variable TIME: I don't know when it has to be 1 or 0, especially for the control group.
I mean... I think I understand that I can have a different treatment period for each firm in my treatment group (correct me if I'm wrong, please), but in this case, if the TIME variable for the control group is always 0, the command doesn't work (look at the image).
So I was thinking that the right way is for the variable TIME to be 1 for the control group in all years from 2006 to 2015, while for the treatment group TIME should be 1 only for the years in which each firm adopts IFRS, regardless of the fact that the first IFRS adoption for some firms was in 2006.
Anyway, I really don't know if what I'm thinking is actually doable.

Thanks for the help.
Attached Files

Last edited by Pietro Fera; 09 Jun 2017, 07:32.
Comment
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17724
#27

09 Jun 2017, 08:21

Pietro:
I see your concern.
What if you replace -i.TIME- with -i.YEAR- in your code (and get rid of -i.TIME-):

Code:

xtset firmID year
xtreg MATCHING i.IFRS##i.YEAR <othercontrols>, fe

Kind regards,
Carlo
(Stata 19.0)
Comment
Pietro Fera

Join Date: Jun 2017

Posts: 3
#28

09 Jun 2017, 09:00

Originally posted by Carlo Lazzaro View Post

Pietro:
I see your concern.
What if you replace -i.TIME- with -i.YEAR- in your code (and get rid of -i.TIME-):

Code:

xtset firmID year
xtreg MATCHING i.IFRS##i.YEAR <othercontrols>, fe

What would the variable YEAR represent in this case? Should it be a continuous variable from 2001 to 2015?

How could I capture the difference between the pre- and post-treatment periods in this way?

I'm going out of my mind with this... thank you

Last edited by Pietro Fera; 09 Jun 2017, 09:07.
Comment
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17724
#29

09 Jun 2017, 09:31

Pietro:
-i.YEAR- should be a categorical variable (such as 2011, 2012 and so forth) that identifies the panel time variable.
The problem with -i.TIME- is that it reduces the interaction to zero.
Try adding it outside the interaction:

Code:

xtset firmID year
xtreg MATCHING i.IFRS##i.YEAR i.TIME <othercontrols>, fe

Kind regards,
Carlo
(Stata 19.0)
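
For readers following the staggered-adoption question, one common alternative formulation (not something proposed in this thread) is a generalized DID with a single time-varying indicator. The sketch below uses simulated data; adopt_year and IFRS_ON are hypothetical names, and the adoption pattern and effect size are made up. The indicator is 1 only in firm-years at or after a firm's own adoption, and it is 0 in every year for firms that never adopt, so no common "post" period has to be invented for the control group.

```stata
* Simulated panel: 30 hypothetical firms, 2001-2015
clear
set seed 2017
set obs 30
gen firmID = _n
* adopt_year (hypothetical): first IFRS year; missing = never adopts (controls)
gen adopt_year = 2006 + mod(firmID, 5) if firmID <= 15
expand 15
bysort firmID: gen year = 2000 + _n
* 1 in firm-years at or after the firm's own adoption, 0 otherwise
gen byte IFRS_ON = !missing(adopt_year) & year >= adopt_year
gen MATCHING = 0.1*(year - 2001) + 1.5*IFRS_ON + rnormal()

xtset firmID year
* Firm fixed effects absorb the adopter/non-adopter distinction and the year
* dummies absorb common shocks; the coefficient on 1.IFRS_ON is the DID-type estimate
xtreg MATCHING i.IFRS_ON i.year, fe vce(cluster firmID)
```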
Comment
Gaurav Dhamija

Join Date: May 2016

Posts: 35
#30

16 Jun 2017, 12:05

Hi,
I have created an artificial data set similar to what I have in my main analysis. Using this data set, I was trying to understand how fixed-effects analysis works.

Say kids in the age group 2 to 6 are exposed to some treatment, so my treatment is defined by age cohort: kids in the 2-6 cohort are treated, whereas the 18-28 cohort is the control. The exposure is defined at the district level, so districts 1 and 3 were affected by the exposure whereas district 2 was not.

height  age  district  affect  treat
2       4    1         1       1
5       18   1         1       0
2.2     6    1         1       1
2.1     5    2         0       1
5.5     28   2         0       0
5.4     20   2         0       0
2.1     6    2         0       1
1.7     4    3         1       1
1.8     4    3         1       1
2       6    3         1       1

Now if I do a cohort-based DID analysis, my DID coefficient comes from the following command:
- reg height i.treat##i.affect

Here the coefficient on the interaction term gives me the DID estimator.

Now if I try to control for age fixed effects, I use - reg height i.treat##i.affect i.age

But in the results, it shows

note: 1.treat#1.affect omitted because of collinearity
note: 28.age omitted because of collinearity

I understand why the coefficient for age 28 has been omitted, but I could not understand why the interaction term has been omitted due to collinearity. Can you please elaborate on this?

Next, I try to estimate the impact of 'treat' on height controlling for age-district fixed effects. I was expecting that, in order to get a coefficient for 'treat', I must have one treated and one control observation for every age-district combination. Surprisingly, I got the following results:

reg height treat i.age#i.district

note: 4b.age#2.district identifies no observations in the sample
note: 5.age#1b.district identifies no observations in the sample
note: 5.age#3.district identifies no observations in the sample
note: 18.age#2.district identifies no observations in the sample
note: 18.age#3.district identifies no observations in the sample
note: 20.age#1b.district identifies no observations in the sample
note: 20.age#3.district identifies no observations in the sample
note: 28.age#1b.district identifies no observations in the sample
note: 28.age#2.district omitted because of collinearity
note: 28.age#3.district identifies no observations in the sample

So I could not understand why I am getting a result for 'treat' and how I should interpret it. In addition, I did not understand why I get the note "28.age#2.district omitted because of collinearity".
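
The collinearity notes above can be checked directly on the ten posted observations. The sketch below re-enters the data and verifies three facts that hold in this particular sample: treat is fully determined by age, the treat#affect interaction equals affect minus an age-18 dummy, and treat is constant within every age-district cell.

```stata
clear
input height age district affect treat
2   4  1 1 1
5   18 1 1 0
2.2 6  1 1 1
2.1 5  2 0 1
5.5 28 2 0 0
5.4 20 2 0 0
2.1 6  2 0 1
1.7 4  3 1 1
1.8 4  3 1 1
2   6  3 1 1
end

* treat is a function of age alone, so it is collinear with the age dummies
assert treat == (age <= 6)
* in this sample the interaction is an exact linear combination of
* affect and the age-18 dummy, hence "omitted because of collinearity"
assert treat*affect == affect - (age == 18)
* treat never varies within an age-district cell
bysort age district: assert treat == treat[1]
```

Since treat does not vary within cells, it is also collinear with the full set of age-district dummies. In the output above, Stata kept treat and dropped 28.age#2.district instead, so the reported coefficient on treat is best read as a contrast between cell means rather than as a treatment effect.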

Attached Files

Last edited by Gaurav Dhamija; 16 Jun 2017, 12:19.
Comment

