  • Diff-in-Diff with multiple treatment periods and treatment magnitudes?

    I have a question regarding how to run a generalized diff-in-diff with treatments that are implemented at different times and with different intensities (continuous rather than binary treatments).

    Suppose I have monthly data over the last 10 years indicating how many flu vaccines a government program provided in 100 cities. Over this 10-year period, this program was the only flu vaccine provider in 35 of these cities. In another 33 cities, there was a single competing commercial provider for the entire 10-year period, and I also have data on how many vaccines that company provided by month. The company also opened vaccine clinics in the remaining 32 cities at different points over the 10-year period (and I have data on the number of vaccines those clinics provided each month once opened).

    I want to estimate how much the private provision of vaccines displaces public provision.

    The data have both a panel element (city) and a time-series element. I believe I understand how to approach this using diff-in-diff if I simply consider the opening of a private clinic as a binary treatment variable. To do this I create a dummy variable called treat, which is 1 for the cities that receive the treatment of having a private clinic open at some time during the 10-year period (it is 1 in those cities' observations at all times, including before treatment started) and 0 in all observations for the untreated group (the cities that have never faced any vaccine competition). Then I create another dummy variable called activetreat, which is 1 in the treatment group after treatment begins, 0 in the treatment group before treatment begins, and 0 in all observations in the control group. Then, something like:

    xtset city

    xtreg government_vaccines i.treat i.activetreat i.time, fe
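
    A minimal sketch of how the two dummies could be built, assuming a monthly date variable and a private-provider count hypothetically named mdate and private_vaccines (neither name comes from the data description above):

    ```stata
    * Hypothetical names: mdate (monthly date), private_vaccines (private count).
    * First month in which each city has any private vaccines (missing if never).
    egen first_open = min(cond(private_vaccines > 0 & !missing(private_vaccines), mdate, .)), by(city)

    * treat: 1 in every month for cities that ever face a private clinic.
    gen byte treat = !missing(first_open)

    * activetreat: 1 only from the opening month onward.
    gen byte activetreat = treat == 1 & mdate >= first_open
    ```

    Under this construction, cities with competition throughout have activetreat equal to 1 in every month, so they would need to be dropped or flagged separately to reproduce the comparison described above.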

    However, this approach effectively throws out data from the 33 cities that always had competition, and it does not take into account the magnitude of the treatment, namely the number of vaccines that private clinics administer, which can vary by time and city (e.g. it can take a few years to ramp up operations).

    I'm aware of a few papers that have done something like this (e.g. Acemoglu, Autor and Lyle (2004)), but it's not clear to me how to operationalize this approach in Stata.

    Can anyone provide any guidance regarding how to run generalized diff-in-diff with treatments that are implemented at different times and with varying magnitudes over time?

  • #2
    After some more searching, would the following be an appropriate way to specify this?

    xtreg government_vaccines i.treat c.ctreat i.year, fe

    Where ctreat is the number of vaccines provided by the private clinic (which is 0 for every month/city that has no private clinic)?


    • #3
      I wouldn't endorse either of the specifications you have shown.

      Your treatment variable (treatment in the code below) needs to be a continuous variable representing the intensity of the intervention (and 0 when there is no intervention). Whether representing treatment intensity by number of vaccines is the best specification is a substantive question I won't advise you on. At least for the purposes of moving to code, let's assume it is.

      Then you need your activetreat variable which is 0 for all observations for cities that never got "treated." And for cities that eventually get treated it is 0 for observations preceding treatment, and 1 for observations at or after the start of treatment.

      The key ingredient in the diff-in-diff analysis, missing from your formulations, is the interaction term. So your code should actually look something like this:

      xtset city year
      xtreg government_vaccines i.year i.activetreat#c.treatment, fe // N.B. #, NOT ##

      Another issue you may want to consider is whether -xtreg- is appropriate here, as the outcome is a count variable. So you might want to consider -xtpoisson- instead. I don't have strong feelings about this either way, but if the conditional outcome distribution really is Poisson-like, -xtreg- will be left with a great deal of heteroskedasticity.
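
      If the data are kept at the monthly level, -xtset city year- would report repeated time values within panel. A sketch that sidesteps this, assuming a hypothetical monthly date variable mdate and adding cluster-robust standard errors (a common choice, not something specified above):

      ```stata
      * Assumption: mdate is a Stata monthly date (%tm).
      * Derive year only if it is not already in the data.
      capture confirm variable year
      if _rc gen year = yofd(dofm(mdate))

      xtset city mdate

      * Linear fixed-effects model; clustering by city is an added assumption.
      xtreg government_vaccines i.year i.activetreat#c.treatment, fe vce(cluster city)

      * Count-model alternative; robust SEs relax the Poisson variance assumption.
      xtpoisson government_vaccines i.year i.activetreat#c.treatment, fe vce(robust)
      ```

      Comparing the sign and rough size of the interaction coefficient across the two models is one way to see whether the choice of estimator matters here.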


      • #4
        Thank you for your reply, which is very helpful.

        The code that you suggest actually yields the exact same result as the code in my second post, likely because i.activetreat#c.treatment is equivalent to c.treatment.
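
        One way to check that equivalence directly, as a sketch: if treatment is 0 in every observation where activetreat is 0, then the 0.activetreat#c.treatment term is identically zero and the remaining term is just treatment.

        ```stata
        * Errors out if any observation has activetreat == 0 but nonzero treatment.
        assert treatment == 0 if activetreat == 0
        ```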

        However, what puzzles me is that if I model this as a binary treatment (starting the year a private clinic opens) with the code in the first post, the result shows that the opening of a private clinic slightly decreases government vaccines. But if I model this as a continuous treatment, the result shows that the opening of a private clinic slightly increases government vaccines. The same happens if I use -xtpoisson-. Conceptually, using the continuous treatment should provide a more meaningful result, and I suppose it is possible that the sign change is an artifact of having a better measure of the treatment. But it also makes me worry that I haven't properly executed one of the approaches. Should I be worried by the sign change?


        • #5
          Yes, you should be worried about this. It suggests that the effect of the amount of vaccine provided by private clinics is non-linearly related to your outcome. Perhaps small amounts result in a slight increase in outcome (maybe the private clinics do some marketing that increases awareness and demand for vaccine, some of which goes to the government clinics) but a large amount siphons patients away from the government clinics. I think the first step here is to explore graphically the relationship between the amount of privately provided vaccine and the amount of government provided vaccine. You may then want to look into a more complicated specification of the treatment variable using some transformation, or perhaps splines.
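
          A rough sketch of both steps, assuming the variable names used earlier in the thread; the knot percentiles and the spline stub tr_sp are illustrative choices, not recommendations:

          ```stata
          * Raw relationship among observations with some private provision.
          twoway (scatter government_vaccines treatment if treatment > 0, msize(tiny)) ///
                 (lowess government_vaccines treatment if treatment > 0)

          * Restricted cubic spline with knots at illustrative percentiles of the
          * positive treatment values (the 10/50/90 choice is an assumption).
          _pctile treatment if treatment > 0, percentiles(10 50 90)
          mkspline tr_sp = treatment, cubic knots(`r(r1)' `r(r2)' `r(r3)')

          * With 3 knots, mkspline creates tr_sp1 and tr_sp2.
          xtreg government_vaccines i.year tr_sp1 tr_sp2, fe
          ```

          Taking the knots from the positive values only avoids piling them up at zero, where most of the untreated observations sit.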