  • Staggered difference in difference

    Dear Statalist Community,

    I am studying the impact of the introduction of body cameras worn by police officers on the total number of violent crimes committed by officers against offenders. I am interested in demonstrating that the introduction of the body cam decreases the number of violent crimes.

    I have panel data from 2009 to 2017, obtained by merging two datasets, one for Cincinnati and one for Orlando. In Orlando, the bodycam was introduced in Nov 2015, and in Cincinnati in Aug 2016. The treatment is the introduction of the body cam; the treatment group is Orlando and the control group is Cincinnati. Three periods are relevant:
    - period 0 (Jan 2009 - Oct 2015): the cities haven't introduced the bodycam yet (both groups were not treated);
    - period 1 (Nov 2015 - Jul 2016): Orlando was treated, and Cincinnati was not;
    - period 2 (Aug 2016 - Oct 2017): both were treated (body cam was compulsory in both cities).

    The variables are:
    Tot_incidents = total number of violent crimes per month
    Tot_monthnum = from 1 (Jan 2009) to 106 (Oct 2017)
    Body_cam = dummy that gives 1 if the city has introduced the body cam, 0 otherwise
    After_Nov15 = dummy that gives 1 if Tot_monthnum>= 83 (Nov 2015), 0 otherwise
    Body_cam*After_Nov15 = interaction term

    The normal diff-in-diff is Tot_incidents = b0 + b1*Body_cam + b2*After_Nov15 + b3*Body_cam*After_Nov15 + e. This does not consider period 2, where the bodycam was introduced in Cincinnati.
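
    For reference, a minimal Stata sketch of this two-period regression, restricted to periods 0 and 1. It assumes a hypothetical 0/1 variable Orlando (1 = Orlando, 0 = Cincinnati) as the group dummy; that name is not part of the variable list above.

    ```stata
    * Sketch only. Orlando is a hypothetical group indicator (1 = Orlando, 0 = Cincinnati).
    * Months 1-91 cover periods 0 and 1; Cincinnati adopts body cams in month 92 (Aug 2016).
    regress Tot_incidents i.Orlando##i.After_Nov15 if Tot_monthnum <= 91, vce(robust)

    * The coefficient on 1.Orlando#1.After_Nov15 corresponds to b3, the DID estimate.
    ```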

    I want to do a staggered diff-in-diff in order to also analyse period 2 and to see the effect when the bodycam was also introduced in Cincinnati. I am trying the code

    xtreg Tot_incidents Body_cam i.Tot_monthnum, fe

    but it does not work.

    Do you have any suggestions? Does it make sense to also analyse period 2?

    Thanks a lot in advance!!
    Last edited by Vittoria Cerioli; 27 Nov 2018, 09:03.

  • #2
    Well, this data is not going to support a DID analysis of the effect of introducing body cams in Cincinnati at the start of period 2 because at that point there ceases to be any control group you can compare it to. You can do a contrast of the number of incidents in period 2 in Cincinnati with the number of incidents in periods 0 and 1 in Cincinnati, but, of course, that is just a within-city contrast that is unadjusted for what might have happened in the absence of the introduction of body cams. It's better than nothing, but it is not as robust as a DID estimator would be. If it is feasible, I would get data from a third city that did not introduce body cams during the entire study period at all: then you could do a generalized DID analysis.
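
    A rough sketch of the two options described above, under stated assumptions: Orlando (1 = Orlando, 0 = Cincinnati) and city_id (a numeric city identifier) are hypothetical variable names, and the second part presumes that data from a never-treated third city have already been appended.

    ```stata
    * Sketch only. Orlando and city_id are hypothetical variables, not in the original data.

    * (1) Within-city contrast for Cincinnati: period 2 versus periods 0 and 1.
    * period2 marks Aug 2016 onward (Tot_monthnum >= 92), when Cincinnati has body cams.
    generate byte period2 = Tot_monthnum >= 92 if !missing(Tot_monthnum)
    regress Tot_incidents i.period2 if Orlando == 0, vce(robust)

    * (2) Generalized DID, once a third, never-treated city is in the data.
    * Body_cam is 1 in city-months with body cams in use, 0 otherwise.
    xtset city_id Tot_monthnum
    xtreg Tot_incidents i.Body_cam i.Tot_monthnum, fe
    ```

    In (2) the city fixed effects and month indicators take the place of the group and period dummies, so the coefficient on 1.Body_cam plays the role of the DID estimate.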

    One other thought. This subject matter is one I know little about, but how large is the typical number of incidents? If these numbers are typically smaller than, say, 30 in any observation, you might want to use -xtpoisson- rather than -xtreg- for this.


    • #3
      The question I raised about number of events was connected to my concerns about -xtreg- vs -xtpoisson- more than to any issues about the use of DID for periods 0 and 1. When the average number of events is sufficiently large, the Poisson distribution is reasonably well approximated by a normal distribution. For small numbers of events, the approximation breaks down, sometimes quite badly. At 50 events per month, a normal approximation should be reasonable, so you can get away with using -xtreg- instead of -xtpoisson- if you prefer.

      That said, since your professor agrees with me that Poisson is really better, it would be worth your while to learn about the Poisson distribution, and Poisson regression. Even if it makes little practical difference in this particular problem, over time you will find that Poisson regression is a very useful thing to have in your toolbox. Not only is it essential for analysis of low-numbered count data, it is also often very useful for situations where you are tempted to log transform your outcome.

      From the narrow perspective of using Stata, you will find that other than substituting the command -xtpoisson- for -xtreg-, very little changes.
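
      As an illustration of how little changes, here is a sketch that reuses the model form from the first post. The panel identifier city_id is a hypothetical name, and vce(robust) is an optional choice rather than something discussed in this thread.

      ```stata
      * Sketch only. city_id is a hypothetical numeric city identifier.
      xtset city_id Tot_monthnum

      * Linear fixed-effects version
      xtreg Tot_incidents i.Body_cam i.Tot_monthnum, fe

      * Poisson fixed-effects version: same variable list, different command
      xtpoisson Tot_incidents i.Body_cam i.Tot_monthnum, fe vce(robust)
      ```

      Adding the irr option to -xtpoisson- reports incidence-rate ratios instead of coefficients on the log scale.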

      It does sound like you are properly situated to do a DID analysis for periods 0 and 1. I would also say that even though a Cincinnati-only analysis of periods 0 and 1 vs 2 is a less robust approach to causal inference, since you have the data it would probably be worth doing that as well. If the results are similar, each analysis would somewhat improve your confidence in the other.


      • #4
        This means that there are strong month-to-month variations in the number of incidents that obscure the effects of the body cam intervention in the first analysis. When those are factored out, the body cam "signal" is seen more strongly.


        • #5
          Well, the R squared statistics that come out of -xtreg, fe- are not entirely analogous to the ones that come out of an ordinary linear regression, so I wouldn't pay too much attention to that. That said, the number of observations in your current output is 176. What was it in the earlier analysis? Perhaps you ended up with a smaller estimation sample due to missing values in the added variable?

          As for whether it makes sense to use Tot_crimes as a variable, that is a content-related question that is beyond my expertise. As an educated layperson I would say it sounds like common sense would support it, but what do I know about this sort of thing? Nevertheless, I will go out on a limb a bit and say that my statistical intuition is that if you are using the log of the number of violent outcomes as your outcome variable, it might make more sense to use the log of the number of crimes as the covariate. A further statistical intuition I have here is that this might argue more strongly for using a Poisson model and have the number of crimes as the exposure variable in the model. But again, these intuitions would not be as important as real knowledge of this content area, which I do not have. So I think it would be better to discuss this issue with your professor than with me.
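
          A sketch of what the exposure-variable version might look like, assuming Tot_crimes holds the monthly number of crimes and is strictly positive, and that city_id is a (hypothetical) numeric city identifier:

          ```stata
          * Sketch only. city_id is a hypothetical city identifier; Tot_crimes is assumed
          * to be the monthly total number of crimes and must be strictly positive.
          xtset city_id Tot_monthnum

          * exposure() enters ln(Tot_crimes) with its coefficient constrained to 1,
          * so the model describes violent incidents per crime rather than raw counts.
          xtpoisson Tot_incidents i.Body_cam i.Tot_monthnum, fe exposure(Tot_crimes)
          ```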


          • #6
            Dear Clyde,

            "If it is feasible, I would get data from a third city that did not introduce body cams during the entire study period at all: then you could do a generalized DID analysis."

            Let's say you have data for a control group that never got treated. How would you set up a regression for this, if you wanted to produce an event study plot?
            Since the treated groups have different treatment years, the control group can't be given a timeline (like t-1, t=0, t+1) that would match both treatment groups.

            I can't find a practical answer to this question and would much appreciate your thoughts.

            What I want is something like:

            reg y i.treated##i.treatmentyear
            margins treatmentyear, dydx(treated)
            marginsplot

            But for this I would need to assign "treatment years" to the control group.

            Thank you very much!
