Visualizing treatment effect over time from difference in difference

Justin Niakamal

Join Date: Aug 2017

Posts: 760
#1

17 Dec 2019, 13:28

I have a question that relates to difference in difference estimation and plots of treatment effects over time. I have created some toy data to illustrate and would just like to verify that this is the correct approach.

To illustrate the effect of a treatment over time you would do something like:

Code:

xtreg y i.treated##i.year x , fe
margins year, dydx(treated) noestimcheck
marginsplot

Whereas the standard set up would look something like

Code:

xtreg y x i.treated##i.post i.year , fe

Is that correct?

Here’s what I see in many journals. Just would like to verify that above is correct.

Lastly, how would I reconfigure the code above to create “years relative to the intervention” (which in this case is 2011)?

Here’s the toy data:

Code:

clear
set more off
set seed 12345
set obs 10
gen int firm = _n
expand 15
bys firm: gen year = 2000 + _n
gen y = runiform(10,50)
gen x = runiform(1,20)
gen int treated = (firm >= 8)
gen int post = (year >= 2011)
replace y = 50 + x if year >= 2011 & treated == 1
xtset firm year

* standard did
xtreg y x i.treated##i.post i.year , fe

* effect over time
xtreg y i.treated##i.year x , fe

* plot effect over time
margins year, dydx(treated) noestimcheck
marginsplot , xline(2011) level(50) xlab(, angle(v)) xtitle("")

Thanks,

Justin
Clyde Schechter

Join Date: Apr 2014

Posts: 30101
#2

17 Dec 2019, 14:10

Is that correct?

Well, that is a correct approach to generalized difference-in-differences estimation. But since in your data the intervention begins at the same time (2011) for all entities, there is no need to use generalized diff-in-diff; stick to the simpler "standard setup." It's easier to interpret.

Lastly, how would I reconfigure the code above to create “years relative to the intervention” (which in this case is 2011)?

Code:

gen years_relative = year - 2011
Justin Niakamal

Join Date: Aug 2017

Posts: 760
#3

17 Dec 2019, 14:18

Thanks for your help as always, Clyde. I have a couple of followup questions if you don't mind.

Well, that is a correct approach to generalized difference-in-differences estimation. But since in your data the intervention begins at the same time (2011) for all entities, there is no need to use generalized diff-in-diff; stick to the simpler "standard setup." It's easier to interpret.

How would I visualize an effect over time in the standard setup? I tend to see these types of charts in journals and would like to know how to set them up in a difference in difference framework.

Lastly,

Code:

gen years_relative = year - 2011

Does this mean

Code:

xtreg y x i.treated##i.post i.year , fe

becomes

Code:

xtreg y x i.treated##i.years_relative i.year , fe

or

Code:

xtreg y x i.treated##i.years_relative i.years_relative , fe
Clyde Schechter

Join Date: Apr 2014

Posts: 30101
#4

17 Dec 2019, 14:55

How would I visualize an effect over time in the standard setup?

The same -margins- and -marginsplot- commands you showed will work in the standard setup as well.
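A minimal sketch of that idea for the standard setup, reusing the toy data from #1. It assumes the quantity of interest is the treated-versus-control difference at each level of post rather than at each year, since the standard model only lets the effect differ between the pre and post periods; the resulting plot therefore has just two points.

```stata
* Rebuild the toy data from post #1
clear
set seed 12345
set obs 10
gen int firm = _n
expand 15
bys firm: gen year = 2000 + _n
gen y = runiform(10,50)
gen x = runiform(1,20)
gen int treated = (firm >= 8)
gen int post = (year >= 2011)
replace y = 50 + x if year >= 2011 & treated == 1
xtset firm year

* Standard DiD: the coefficient on 1.treated#1.post is the DiD estimate
xtreg y x i.treated##i.post i.year, fe

* Treated-vs-control difference in the pre and post periods
margins post, dydx(treated) noestimcheck
marginsplot, xtitle("")
```

A year-by-year plot needs a model that lets the effect vary by year, which is what the i.treated##i.year specification in #1 does.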

Does this mean
Code:

xtreg y x i.treated##i.post i.year , fe

becomes
Code:

xtreg y x i.treated##i.years_relative i.year , fe

or
Code:

xtreg y x i.treated##i.years_relative i.years_relative , fe

Either one would be fine. The results will be the same, except for the constant term. Possibly, just for ease of understanding, the second approach is preferable.
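One practical wrinkle with either specification: Stata's factor-variable notation accepts only nonnegative integer values, and years_relative is negative for the pre-intervention years, so i.years_relative is rejected as written. A minimal sketch of one workaround, again on the toy data from #1, shifts the variable so that its minimum is zero. The names rel_shift, shift and base are made up for illustration, and using the year before the intervention as the reference period is an assumption, not something stated in the thread.

```stata
* Rebuild the toy data from post #1
clear
set seed 12345
set obs 10
gen int firm = _n
expand 15
bys firm: gen year = 2000 + _n
gen y = runiform(10,50)
gen x = runiform(1,20)
gen int treated = (firm >= 8)
replace y = 50 + x if year >= 2011 & treated == 1
xtset firm year

gen int years_relative = year - 2011

* Shift so the smallest value is 0 (factor variables cannot be negative)
summarize years_relative, meanonly
local shift = -r(min)
gen int rel_shift = years_relative + `shift'

* Assumed reference period: the year just before the intervention
local base = `shift' - 1

* The ## operator already includes both main effects
xtreg y x i.treated##ib`base'.rel_shift, fe

* Treated-vs-control difference at each relative year
margins rel_shift, dydx(treated) noestimcheck

* The x axis shows shifted values; the vertical line marks the intervention year
marginsplot, xline(`shift') xlab(, angle(v)) xtitle("")
```

Because ## expands to the main effects plus the interaction, adding i.years_relative (or i.year, which is collinear with it) a second time does not change the fit.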
Justin Niakamal

Join Date: Aug 2017

Posts: 760
#5

17 Dec 2019, 15:26

Thanks, Clyde.

I really hope you'll one day consider writing a practitioner's guide to difference in difference studies using Stata!
Stephen Ch

Join Date: Apr 2022

Posts: 67
#6

01 Aug 2022, 10:10

Hi Clyde, I came across your reply to Justin's post.

I am trying to produce the same plot, and I have used -xtdidregress-.

After running -xtdidregress-, I used -estat grangerplot- to plot the time-specific ATETs.

I am getting an error message that states "treatment assignment times vary; not allowed with estat grangerplot".

My data structure is a panel data set similar to the artificial example on the -xtdidregress- Stata manual page (use https://www.stata-press.com/data/r17/hospdd, "Artificial hospital admission procedure data").

My question is:

1. How do I first inspect where the time-varying assignments exist in my data set? I have a quarterly data set where I assign the beginning of the treatment at 2019q1, so I don't understand how the treatment assignment varies in this case.

2. Even with the time-varying treatment assignments, is there a way to produce the plot Justin originally requested or a similar one to the grangerplot?

Help is much appreciated!

Thanks.

Clyde Schechter

Join Date: Apr 2014

Posts: 30101
#7

01 Aug 2022, 10:20

I'm sorry, but I am not familiar with the -xtdidregress- and -estat grangerplot- commands, so I cannot advise you on their use.

Concerning your first question, assuming that your panel identifier is called panelid, your time variable is called time, and your pre vs post intervention indicator is called pre_post (with 0 = pre and 1 = post) you can do this:

Code:

tabstat time if pre_post == 1, by(panelid) statistics(min)

and you will see the starting time for intervention in each panel.
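To check at a glance whether the start times differ across panels, the same information can be stored in a variable and tabulated once per panel. This is a sketch on made-up data using the hypothetical variable names above (panelid, time, pre_post); first_post and tag are illustrative names.

```stata
* Made-up example: panel 1 starts treatment in 2020q1, panel 2 in 2021q1
clear
input byte panelid int time byte pre_post
1 239 0
1 240 1
1 241 1
2 244 1
2 245 1
end
format %tq time

* Earliest post-intervention period within each panel
bysort panelid: egen first_post = min(cond(pre_post == 1, time, .))
format %tq first_post

* One row per panel, then the distribution of start times
egen byte tag = tag(panelid)
tabulate first_post if tag
```

More than one row in the table means the treatment start differs across panels, which is the situation the error message describes.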
Stephen Ch

Join Date: Apr 2022

Posts: 67
#8

01 Aug 2022, 12:35

Hi Clyde, this is extremely helpful.

Let me dig into this more.

Thanks a lot!

Stephen Ch

Join Date: Apr 2022

Posts: 67
#9

01 Aug 2022, 18:48

Hi Clyde,

I inspected my data to see which panelid values have such cases.

The basic issue is that some panels have no data points in the quarter when treatment is assigned, so their observations start some quarters later.

I would like to drop any panelid whose data points start later than the treatment assignment date.

For example, consider this example: the treatment begins at 2020 Quarter 1.

hospital  year  qtr  yrqtr   pre_post
1         2019  4    2019q4  0
1         2020  1    2020q1  1
1         2020  2    2020q2  1
2         2021  1    2021q1  1
2         2021  2    2021q2  1

Here, the panelid is hospital, and hospital 2's data starts at 2021 Quarter 1, so I would like to drop hospital 2 altogether from the data.

What might be the best way to do this?

I thought about something like:

Code:

drop if pre_post == 1 & yrqtr != tq(2020q1)

But, then I realized it would also drop hospital 1's 2020Q2 data point.

Any advice would be much appreciated!

Thanks.

Clyde Schechter

Join Date: Apr 2014

Posts: 30101
#10

01 Aug 2022, 19:00

I would like to drop any panelid that have data points start later than the treatment assignment date.

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input byte hospital int year byte(qtr pre_post) float yrqtr
1 2019 4 0 239
1 2020 1 1 240
1 2020 2 1 241
2 2021 1 1 244
2 2021 2 1 245
end
format %tq yrqtr

by hospital (yrqtr), sort: egen earliest_post_quarter = ///
    min(cond(pre_post, yrqtr, .))
drop if earliest_post_quarter > tq(2020q1)

In the future, when showing data examples, please use the -dataex- command to do so, as I have here. If you are running version 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.
Stephen Ch

Join Date: Apr 2022

Posts: 67
#11

02 Aug 2022, 06:47

Hi Clyde,

Thanks for the comments on the -dataex- command. I just got started on this forum, and this convention seems very helpful.

Just a quick question on your conditional statement. You have pre_post as the first argument, but what does it mean for this to be true?

I looked at the help for the cond() function in Stata, but without an explicit Boolean expression I am not sure what your command returns.

Any clarification would be appreciated.

Once again, thanks!

Clyde Schechter

Join Date: Apr 2014

Posts: 30101
#12

02 Aug 2022, 09:06

In Stata, when a variable or numeric expression is used in a Boolean context, 0 is interpreted as false, and anything other than zero (including missing value) is interpreted as true. So in the -cond()- function, when pre_post is zero, it will return ., and when pre_post is anything else (which, for this variable is just 1), it will return yrqtr.
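The rule can be checked directly with a few assertions. This is a small sketch with made-up numbers (10, 20, 99), not tied to the hospital data; each assert passes silently when the statement is true.

```stata
* cond(x, a, b) returns a when x is nonzero, b when x is 0
assert cond(0, 10, 20) == 20
assert cond(1, 10, 20) == 10
assert cond(-3, 10, 20) == 10

* Missing counts as nonzero, so with three arguments it also returns a
assert cond(., 10, 20) == 10

* An optional fourth argument is returned instead when x is missing
assert cond(., 10, 20, 99) == 99

* Writing the comparison out makes a missing indicator evaluate to false
assert cond(. == 1, 10, 20) == 20
```

If pre_post could ever be missing, writing cond(pre_post == 1, yrqtr, .) keeps those observations from being treated as post-intervention.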