
  • Estimating diff-in-diff with multiple, cumulative treatment effects.

    I am attempting to use difference-in-difference estimation to measure the effect of new machines on appointments at a hospital. These new machines allow doctors to see more patients.

    I have data from 2000-2015 by doctor and year. Hospital A (my treatment group) received 10 new machines in 2012, 12 in 2013, and 15 in 2014. Hospital B (my control group) received no new machines.

    I would like to isolate the effect of having all 37 of the machines on appointments (but I am open to suggestions if I should be doing something else). If I use 2012 to 2015 as the post-treatment period, then I am underestimating the effect of all the machines. If I use 2014 and 2015, then I am comparing 37 machines to zero machines (before 2012), 10 machines (2012), and 22 machines (2013).

    Besides dropping data from 2012 and 2013, is there a way to estimate these effects?

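    Code:
    * Sum appointments to one row per hospital and year
    collapse (sum) appointments, by(date hospital)
    * date holds the year as a number, so compare it to 2012 directly
    generate time = (date >= 2012)
    generate treated = (hospital == "A")

    * Difference-in-differences: the time#treated interaction is the estimate
    regress appointments i.time##i.treated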

    Example data (by doctor and year):
    Date  Doctor  Hospital  Appointments  Treated  Time
    2010  0001    A         20            1        0
    2010  0002    A         11            1        0
    2010  0003    B         51            0        0
    2010  0004    B          7            0        0
    2011  0001    A         22            1        0
    2011  0002    A         14            1        0
    2011  0003    B         42            0        0
    2011  0004    B          5            0        0
    2014  0001    A         26            1        1
    2014  0002    A         17            1        1
    2014  0003    B         60            0        1
    2014  0004    B          9            0        1


  • #2
    I would like to isolate the effect of having all 37 of the machines on appointments (but I am open to suggestions if I should be doing something else).
    I don't think this is possible with the data designed in this way. When the last batch of new machines was delivered, it entered an environment in which the hospital staff had already been using these machines for a couple of years. Whatever "learning curve" issues may have attended their initial introduction in 2012 would have been quite attenuated by then. The circumstances were completely different. (This is true even if the doctors who got the new machines in 2014 never got to use any of the earlier ones: support staff will have worked with them, and logistical issues associated with storage, maintenance, etc., would have been ironed out.) So all that the data from 2014 onward can tell you is the incremental impact of adding 15 new machines in a context where there were already 22 in use.

    What I think you can estimate very nicely is the initial impact of their introduction and the subsequent impact of each increment. Something like this:

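    Code:
    * One row per hospital and year
    collapse (sum) appointments, by(date hospital)
    rename date year
    * era indexes the cumulative number of machines installed at Hospital A
    gen byte era = 0 if year < 2012
    replace era = 1 if year == 2012
    replace era = 2 if year == 2013
    replace era = 3 if year >= 2014
    label define era 0 "No machines" 1 "10 Machines" 2 "22 Machines" 3 "37 Machines"
    label values era era
    gen byte treated = (hospital == "A")
    * The era#treated interaction terms are the difference-in-differences estimates
    poisson appointments i.era##i.treated
    * Predicted appointment counts in each era, by hospital
    margins era#treated
    * Each era contrasted with the no-machines era, within each hospital
    margins r.era, at(treated = (0 1))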
    Note that I'm inclined towards -poisson- rather than -reg- here because your variable is inherently a count, and the values are, in some cases, pretty small, so a normal error structure doesn't seem very suitable.
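
    For readers who want the specification written out, the -poisson- model above can be sketched as follows. The symbols (h for hospital, t for year, and the coefficient names) are notation introduced here for illustration only; they do not appear in the Stata output.

    ```latex
    \log E[\text{appointments}_{ht}]
      = \beta_0
      + \sum_{k=1}^{3} \beta_k \,\mathbf{1}[\text{era}_t = k]
      + \gamma \,\text{treated}_h
      + \sum_{k=1}^{3} \delta_k \,\mathbf{1}[\text{era}_t = k]\,\text{treated}_h
    ```

    Under this notation, each \delta_k is the difference-in-differences estimate for era k on the log scale, so \exp(\delta_k) is a ratio of ratios: the proportional change in appointments at Hospital A from the no-machines era to era k, divided by the same proportional change at Hospital B.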

    Let me also point out that if the data shown are the complete available data, it is really too sparse for heavy-duty analysis. With only 12 observations, even if we stuck with a binary time variable (before vs after 2012, say) there would be 3 variables. With my approach, there are 7! That doesn't leave many degrees of freedom for estimating things. I would hope that there are more hospitals in the real study in both arms, or perhaps more time periods available (perhaps appointment data could be disaggregated to quarterly or monthly observations). If not, you might be better off just graphing the number of appointments as a function of time in each hospital and forgoing fancy statistics.
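
    If finer-grained data are available, one possible way to use them is to skip the -collapse- step and fit the same specification on doctor-level rows. What follows is only a rough sketch under stated assumptions: the variable names follow the mock-up (date holding the year, doctor, hospital, appointments), doc_id is a hypothetical helper variable, and clustering the standard errors by doctor is an assumption rather than a recommendation. With monthly data, the era cutoffs would need to be expressed in months instead of years.

    ```stata
    * Sketch only: variable names follow the mock-up; doc_id is a hypothetical helper.
    rename date year
    gen byte era = 0 if year < 2012
    replace era = 1 if year == 2012
    replace era = 2 if year == 2013
    replace era = 3 if year >= 2014
    gen byte treated = (hospital == "A")
    * group() yields a numeric doctor identifier whether doctor is string or numeric
    egen long doc_id = group(doctor)
    * Same specification, on doctor-level rows; clustering by doctor is an assumption
    poisson appointments i.era##i.treated, vce(cluster doc_id)
    ```

    The two -margins- commands shown earlier can be run after this model in the same way.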
    Last edited by Clyde Schechter; 29 Oct 2016, 09:46.


    • #3
      Clyde,

      I want to sincerely thank you; this is a big help.

      This is just a mock-up of my data. I actually have monthly data by doctor. Once I collapse the data by hospital (A and B), I go from thousands of observations to 100.

      Thanks again,

      William
