Estimating diff-in-diff with multiple, cumulative treatment effects.

William Hawk

Join Date: Mar 2015
Posts: 9


29 Oct 2016, 07:56

I am attempting to use difference-in-difference estimation to measure the effect of new machines on appointments at a hospital. These new machines allow doctors to see more patients.

I have data from 2000-2015 by doctor and year. Hospital A (my treatment group) received 10 new machines in 2012, 12 in 2013, and 15 in 2014. Hospital B (my control group) received no new machines.

I would like to isolate the effect of having all 37 of the machines on appointments (but I am open to suggestions if I should be doing something else). If I use 2012 to 2015 as the post-treatment period, then I am underestimating the effect of all the machines. If I use 2014 and 2015, then I am comparing 37 machines to a mix of zero machines (before 2012), 10 machines (2012), and 22 machines (2013).

Besides dropping the data from 2012 and 2013, is there a way to estimate these effects?

Code:

collapse (sum) appointments, by(date hospital)
* date holds the calendar year (e.g. 2010), so compare it to 2012 directly
generate time = (date >= 2012)
generate treated = (hospital == "A")

regress appointments i.time##i.treated

Date	Doctor	Hospital	Appointments	Treated	Time
2010	0001	A	20	1	0
2010	0002	A	11	1	0
2010	0003	B	51	0	0
2010	0004	B	7	0	0
2011	0001	A	22	1	0
2011	0002	A	14	1	0
2011	0003	B	42	0	0
2011	0004	B	5	0	0
2014	0001	A	26	1	1
2014	0002	A	17	1	1
2014	0003	B	60	0	1
2014	0004	B	9	0	1

Tags: difference, difference-in-difference, regress

Clyde Schechter

Join Date: Apr 2014

Posts: 30174
#2

29 Oct 2016, 09:37

I would like to isolate the effect of having all 37 of the machines on appointments (but I am open to suggestions if I should be doing something else).

I don't think this is possible with data structured this way. When the last batch of new machines was delivered, it entered an environment in which the hospital staff had already been working with these machines for a couple of years. Whatever "learning curve" issues may have attended their initial introduction in 2012 would be quite attenuated by then. The circumstances were completely different. (This is true even if the doctors who got the new machines in 2014 never got to use any of the earlier ones: support staff will have worked with them. Logistical issues associated with storage, maintenance, etc., would have been ironed out. Etc.) So all that the data surrounding 2014 can inform you of is the incremental impact of adding 15 new machines in a context where there were already 22 in use.

What I think you can estimate very nicely is the initial impact of their introduction and the subsequent impact of each increment. Something like this:

Code:

collapse (sum) appointments, by(date hospital)
rename date year
gen byte era = 0 if year < 2012
replace era = 1 if year == 2012
replace era = 2 if year == 2013
replace era = 3 if year >= 2014
label define era 0 "No machines" 1 "10 Machines" 2 "22 Machines" 3 "37 Machines"
label values era era
gen byte treated = (hospital == "A")
poisson appointments i.era##i.treated
margins era#treated
margins r.era, at(treated = (0 1))

Note that I'm inclined towards -poisson- rather than -reg- here because your variable is inherently a count, and the values are, in some cases, pretty small, so a normal error structure doesn't seem very suitable.
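
To make the interpretation concrete, here is a minimal sketch that runs on the mock-up data from #1 after summing appointments by hospital and year. The -irr- option is an addition for illustration: it reports exponentiated coefficients, so each era#treated interaction reads as the ratio of Hospital A's proportional change since the no-machine era to Hospital B's proportional change over the same span. Because the mock-up contains only 2010, 2011 and 2014, only era 3 is identified here.

```stata
* Mock-up data from #1, already summed by hospital and year
clear
input int year str1 hospital int appointments
2010 "A" 31
2010 "B" 58
2011 "A" 36
2011 "B" 47
2014 "A" 43
2014 "B" 69
end

gen byte treated = (hospital == "A")
gen byte era = cond(year < 2012, 0, cond(year == 2012, 1, cond(year == 2013, 2, 3)))

* Exponentiated coefficients: the interaction is a ratio of rate ratios
poisson appointments i.era##i.treated, irr

* Predicted counts in each era-by-hospital cell
margins era#treated
```

With the full data the same commands would report one interaction term per era, each giving the cumulative effect relative to the no-machine period.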

Let me also point out that if the data shown are the complete available data, it is really too sparse for heavy-duty analysis. With only 12 observations, even if we stuck with a binary time variable (before vs after 2012, say) there would be 3 variables. With my approach, there are 7! That doesn't leave many degrees of freedom for estimating things. I would hope that there are more hospitals in the real study in both arms, or perhaps more time periods available (perhaps appointment data could be disaggregated to quarterly or monthly observations). If not, you might be better off just graphing the number of appointments as a function of time in each hospital and forgoing fancy statistics.
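
The graphing suggestion could look something like the following sketch. It again uses the mock-up figures from #1 summed by hospital and year; the dashed vertical lines at 2012, 2013 and 2014 mark the years in which machines were delivered, and the legend labels are only illustrative.

```stata
* Mock-up data from #1, already summed by hospital and year
clear
input int year str1 hospital int appointments
2010 "A" 31
2010 "B" 58
2011 "A" 36
2011 "B" 47
2014 "A" 43
2014 "B" 69
end

gen byte treated = (hospital == "A")

* One connected line per hospital, with delivery years marked
twoway (connected appointments year if treated == 1, sort) ///
       (connected appointments year if treated == 0, sort), ///
       xline(2012 2013 2014, lpattern(dash)) ///
       legend(order(1 "Hospital A (machines)" 2 "Hospital B (no machines)")) ///
       xtitle("Year") ytitle("Appointments")
```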

Last edited by Clyde Schechter; 29 Oct 2016, 09:46.
William Hawk

Join Date: Mar 2015

Posts: 9
#3

30 Oct 2016, 12:04

Clyde,

I want to sincerely thank you; this is a big help.

This is just a mock-up of my data. I actually have monthly data by doctor. Once I collapse the data by hospital A and B I go from thousands of obs to 100.

Thanks again,

William
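
For monthly data, the era variable from #2 can be derived from the calendar year of each month. The sketch below is only an illustration: mdate is a hypothetical name for a Stata monthly date variable, the 192 generated months stand in for 2000-2015, and it assumes each batch of machines counts from January of its delivery year, since the thread does not give delivery months.

```stata
* Hypothetical monthly calendar: 16 years x 12 months, Jan 2000 to Dec 2015
clear
set obs 192
gen int mdate = tm(2000m1) + _n - 1
format mdate %tm

* Calendar year of each month, then the same era coding as in #2
gen int year = yofd(dofm(mdate))
gen byte era = cond(year < 2012, 0, cond(year == 2012, 1, cond(year == 2013, 2, 3)))

tabulate era
```

On the real data, the -collapse- would then be by(mdate hospital) rather than by year, which keeps twelve observations per hospital for each of the 10- and 22-machine eras.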