Difference-in-difference with a time variant treatment variable

Clyde Schechter

Join Date: Apr 2014

Posts: 30084
#16

07 Aug 2018, 10:33

Re #14.

Say we have firms in New York vs. New Jersey; the former had legislation changes in multiple years. Firms in New Jersey are the control.

It depends on whether the legislation changes constitute the same intervention being introduced at different times to different populations of firms, or whether each legislation change is actually a different intervention.

Re #15.

Each region doesn't correspond to a single mine necessarily but to a specific cluster. Should I add i.cluster instead then?

Well, if you have a variable corresponding to the individual mines, it would be best to add that. If that information is not available, then I suppose adding i.cluster is the next best thing.
emna khemakhem

Join Date: Dec 2017

Posts: 10
#17

08 Aug 2018, 04:39

Hi Clyde,
I have a question about the diff-in-diff model, and I hope to benefit from your expertise on the subject.
I would like to use diff-in-diff to analyse the effects of taxing the options market in one country. My treated group is an options index that will be taxed, and my control group will be another index that is not subject to the tax. I have daily data for 2015 and 2016.

So my question is: can I have only one price index in the treated group and one index in the control group?
Thank you

Best regards,

Emna
Clyde Schechter

Join Date: Apr 2014

Posts: 30084
#18

08 Aug 2018, 09:21

Well, you can, but your causal inference will be rather iffy.

One of the limitations of difference-in-differences as a strategy for identifying a causal effect is that it really just demonstrates that at a certain point in time, the time when the intervention occurred, the treatment group, previously showing outcomes parallel to those of the control group, deviates in a way that the control group does not. Now, while we like to infer from this that the intervention itself caused the deviation, in fact if there is anything else that happened to the treatment group but not the control group, concurrent with the intervention, that something else is an equally good candidate for the cause of the deviation.

When the treatment and control groups are large, it is unlikely that there will be other things that happened to treatment but not control just at that time--it becomes a far-fetched coincidence. But if you have only one treatment entity and one control entity, there may well be lots of other candidates for the cause of the outcome deviation.

So if you are going to do a study based on DID analysis of a single treatment entity and a single control entity, you will have to shore up your causal inference by presenting other evidence that nothing but the intervention you are studying occurred at that time affecting the treatment and control entities differently.
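For concreteness, a two-entity DID of this kind could be set up roughly as follows. This is only a sketch under stated assumptions: the variable names (price, taxed_index, date) and the tax start date are placeholders introduced for illustration, not details of the data described above.

```stata
* Hypothetical variables (not from the original post):
*   price       = daily index value (the outcome)
*   taxed_index = 1 for the taxed index, 0 for the untaxed control index
*   date        = daily Stata date
* The tax start date below is a placeholder; substitute the real one.
gen byte post = date >= td(01jan2016) if !missing(date)

* Classical two-group, two-period DID via an interaction
regress price i.taxed_index##i.post, vce(robust)

* The coefficient on 1.taxed_index#1.post is the DID estimate
```

The standard errors from such a regression do nothing to rule out other events that hit only the taxed index at the same time; that still has to be argued from outside evidence.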
emna khemakhem

Join Date: Dec 2017

Posts: 10
#19

09 Aug 2018, 05:37

Thank you so much for your quick and very clear answer. Also, besides the DiD method, is there any other method that may help me capture the impact of the policy on my treatment entity (other than just including a dummy variable)?
Clyde Schechter

Join Date: Apr 2014

Posts: 30084
#20

09 Aug 2018, 09:16

With this kind of data, I don't see any approach that is better than DiD. It's just weak data, and I think DiD will make the most of it--but in this case "the most" is not very much.
emna khemakhem

Join Date: Dec 2017

Posts: 10
#21

09 Aug 2018, 10:40

I see. Thank you so much again for your response.
Sandra Loayza

Join Date: Aug 2018

Posts: 8
#22

26 Aug 2018, 11:08

Hello again

I am still working on the topic I mentioned in the first post. I am now struggling to understand how to show the parallel trends assumption for my generalized difference-in-differences analysis. There is no unique before-after period in which all mines become active; rather, a mine may become active in any year between 2004 and 2009 and may later become inactive. How could I conduct an event study, or show any other type of evidence that supports the main assumption of a difference-in-differences analysis?

Any help would be greatly appreciated

Sandra
Clyde Schechter

Join Date: Apr 2014

Posts: 30084
#23

26 Aug 2018, 14:48

There are three simple approaches, each partially satisfactory.

The first is to look at the trends in both groups up to the time of the first intervention. At that point all the treated entities are still untreated and the trends up to that point should be parallel. Of course, this says nothing about what happens at any later point in time to those in the treatment group who are still not treated but later will be.
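A rough sketch of this first approach, assuming a panel with hypothetical variables id, year, and outcome, plus the 0/1 under_treatment indicator used in the xtreg code discussed in this thread. The variable ever_treated is constructed here for illustration.

```stata
* Assumed variables: id (unit), year, outcome, under_treatment (0/1)
* Flag units that are treated at any point in the data
bysort id: egen byte ever_treated = max(under_treatment)

* Year of the earliest treatment anywhere in the data
summarize year if under_treatment == 1, meanonly
local first_year = r(min)

preserve
* Keep only the period before anyone is treated
keep if year < `first_year'
collapse (mean) outcome, by(ever_treated year)
twoway (connected outcome year if ever_treated == 0) ///
       (connected outcome year if ever_treated == 1), ///
       legend(order(1 "Never treated" 2 "Eventually treated"))
restore
```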

The second way is to create a new time variable: time_before_intervention = intervention_date - date. So this will be positive prior to the intervention and negative after. You then extend the definition of time_before_intervention to the control group by pretending that they, too, underwent the intervention in the year after they were last observed. That is, it takes on the value 1 in the last year they are observed, 2 in the year before that, 3 in the year before that, etc. You then plot outcome vs time_before_intervention separately for the treatment and control groups and examine whether the curves appear parallel. The limitation of this approach is that if there is a secular trend in the outcome, that can distort the results because, for example, time_before_intervention = 1 is a different year in treated entities than it is in the controls, so there is confounding that can make it appear the trends are not parallel even if they "really" are. But if there is no secular trend seen in the control group, then this approach is reliable.
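The construction just described might look something like the following standalone sketch. The variable names id, year, and outcome are assumptions, under_treatment is the 0/1 indicator from the xtreg code discussed in this thread, and because treatment can switch on and off, the sketch takes the first onset as the intervention date. Restricting the plot to the pre-intervention values is a choice made here, not part of the description above.

```stata
* Assumed variables: id (unit), year, outcome, under_treatment (0/1)
bysort id: egen byte ever_treated = max(under_treatment)

* First year under treatment (missing for never-treated units)
bysort id: egen first_treat_year = min(cond(under_treatment == 1, year, .))

* Controls: pretend the intervention came the year after they were last observed
bysort id: egen last_year = max(year)
replace first_treat_year = last_year + 1 if ever_treated == 0

* Positive before the (real or pretended) intervention, zero or negative after
gen time_before_intervention = first_treat_year - year

preserve
keep if time_before_intervention > 0
collapse (mean) outcome, by(ever_treated time_before_intervention)
* Reverse the x-axis so that time runs left to right toward the intervention
twoway (connected outcome time_before_intervention if ever_treated == 0) ///
       (connected outcome time_before_intervention if ever_treated == 1), ///
       xscale(reverse) legend(order(1 "Control" 2 "Treatment"))
restore
```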

A third approach is to simply plot outcome vs time (regular calendar time) in the treatment and control groups, but exclude from the graphs any observation that occurs at or after the time of treatment. So all points in the graph represent untreated results, but they are distinguished by membership in the treatment and control groups, and all points in time are covered. This is probably the best of the three methods overall. The main drawback is that towards the later time periods, the number of points in the treatment graph gets small and the treatment graph can get very erratic.
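A standalone sketch of this third approach, again assuming hypothetical variables id, year, and outcome together with the 0/1 under_treatment indicator from the xtreg code discussed in this thread. "Time of treatment" is taken here to mean the first year a unit is under treatment, which is an assumption.

```stata
* Assumed variables: id (unit), year, outcome, under_treatment (0/1)
bysort id: egen byte ever_treated = max(under_treatment)
bysort id: egen first_treat_year = min(cond(under_treatment == 1, year, .))

preserve
* Drop observations at or after first treatment; never-treated units keep all years
drop if ever_treated == 1 & year >= first_treat_year

* Group means by calendar year, plus the number of observations behind each mean
collapse (mean) outcome (count) n_obs = outcome, by(ever_treated year)
twoway (connected outcome year if ever_treated == 0) ///
       (connected outcome year if ever_treated == 1), ///
       legend(order(1 "Control" 2 "Treatment group, not yet treated"))

* Shows how thin the treatment-group means become in later years
list ever_treated year n_obs, sepby(ever_treated)
restore
```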
Sandra Loayza

Join Date: Aug 2018

Posts: 8
#24

27 Aug 2018, 04:52

Thanks a lot for your reply Clyde! I'll take a look at the 3 approaches
Tom Lee

Join Date: Aug 2021

Posts: 1
#25

31 Aug 2021, 11:42

Hello everyone. Mr. Schechter responded with the code below. Is this an example of a generalized DID model? Thank you!
gen under_treatment = (active == 1) & (near == 1) if !missing(active, near)
xtset variable_identifying_individual
xtreg outcome_variable i.under_treatment i.year perhaps_other_covariates, fe
Clyde Schechter

Join Date: Apr 2014

Posts: 30084
#26

15 Sep 2021, 11:07

Well, it has the formal properties of a generalized DID model. Whether it really is one depends on what the variables active and near represent. If near is an indicator of some treatment or intervention and active is an indicator of those time periods in which those who receive treatment are actually receiving it, then it is, indeed, a generalized DID model.
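In equation form, the model fit by that xtreg command can be written as below. The notation is introduced here for illustration only: $y_{it}$ is the outcome for individual $i$ in year $t$, $\alpha_i$ is the individual fixed effect (the fe option), $\gamma_t$ are the year effects (i.year), $D_{it}$ is under_treatment, and $X_{it}$ stands for the other covariates. Writing $D_{it}$ as a product assumes active and near are both 0/1 indicators.

```latex
y_{it} = \alpha_i + \gamma_t + \delta D_{it} + X_{it}'\beta + \varepsilon_{it},
\qquad
D_{it} = \mathrm{active}_{it} \times \mathrm{near}_{it}
```

Under this reading, $\delta$ is the generalized DID estimate: there is no single pre/post split, only unit effects, period effects, and an indicator that switches on whenever a unit is actually under treatment.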
Bruno Paisani

Join Date: Oct 2021

Posts: 9
#27

04 Oct 2021, 09:30

Originally posted by Clyde Schechter

So, this data is not amenable to a classical difference in differences analysis, because the treatment is intermittent and its onset is not synchronized across units. Instead, you have to use a generalized difference in differences analysis. For a good explanation of the approach, take a look at

https://www.ipr.northwestern.edu/wor.../Day%204.2.pdf

Your actual treatment variable here is neither active nor near but their conjunction. So I would set this up as follows:

Code:

gen under_treatment = (active == 1) & (near == 1) if !missing(active, near)
xtset variable_identifying_individual
xtreg outcome_variable i.under_treatment i.year perhaps_other_covariates, fe

The coefficient of under_treatment is then your generalized DID estimate of the effect of living near a mine when it is active.

The perhaps_other_covariates part of the model should include other variables that are relevant to the outcome and are not simply unchanging fixed attributes of the individual. In particular, if there are variables describing properties of the mines themselves which are relevant to the outcome, I recommend including those. (However, if an individual is always living near the same mine and if the mine's attributes do not change over time, then these will also be constant within individual and will be omitted due to collinearity.)

Dear Clyde, good morning.

The URL you posted a long time ago about GDD is no longer available. Could you point me to a different link where the same material can be found?
That would be very useful to me, as I am struggling to find a way to handle a similar research design in which (insurance) companies are supervised "on and off" during the period of analysis.

Thanks a million.
Bruno.
Clyde Schechter

Join Date: Apr 2014

Posts: 30084
#28

04 Oct 2021, 09:54

Yes, that link got taken down, and I do not know where the underlying material can be found any more. Instead, I recommend the following link: https://www.annualreviews.org/doi/pd...-040617-013507. This one still works as of this writing.
Bruno Paisani

Join Date: Oct 2021

Posts: 9
#29

04 Oct 2021, 11:00

All right, thanks!