  • Diff in diff setting - question about research design

    Hello,

    I am working on running a diff in diff regression but would like some feedback on how I should specify my variables appropriately given the research setting.

    I would like to study the effect of a policy that was introduced in all states in 1973; five states had already introduced it in 1970. Would this be considered a staggered diff-in-diff setting? How should I define my treatment variable? Should I set treatment = 1 for the states that legalized in 1970 and then for all states from 1973? Alternatively, can I create a treatment variable called early state, which = 1 for the 5 states that legalized in 1970?

    And in either case, how would I go about testing pre-trends? I am under the impression that I have to run a regression of my outcome on interactions between the treatment and post period years (excluding the year pre treatment) with state and year fixed effects. Is this a valid way to go about it?

    Many thanks for your help in advance!







  • #2
    Hi Ola,

    Yes, this sounds like a staggered diff-in-diff. I imagine you have a panel dataset (by state and year)? If so, I would set treatment = 1 in each year that the new policy is in place; this means = 1 for the 5 early states from 1970 onward, and = 1 for the other states from 1973 onward.
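
    As a minimal sketch of that coding rule (written in Python rather than Stata, with made-up labels A to E standing in for the five early states; the names EARLY_STATES, adoption_year and treated are illustrative only), the indicator switches on in the adoption year and stays on afterwards:

```python
# Hypothetical labels for the five early-adopting states (assumption, not real data).
EARLY_STATES = {"A", "B", "C", "D", "E"}


def adoption_year(state):
    """Year the policy starts in `state`: 1970 for early states, 1973 otherwise."""
    return 1970 if state in EARLY_STATES else 1973


def treated(state, year):
    """Return 1 if the policy is in place in `state` during `year`, else 0."""
    return int(year >= adoption_year(state))


# Print one early state ("A") and one later state ("F") over a few years.
for state in ["A", "F"]:
    for year in range(1968, 1975):
        print(state, year, treated(state, year))
```

    The same rule applies whether the rows are state-year cells or individual survey respondents tagged with a state and a year.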

    There is quite a lot of recent literature on staggered diff-in-diff, showing that the DiD estimator doesn't necessarily estimate what one might expect. I suggest reading this article here and the papers he recommends.

    The parallel trends test you suggest seems okay (although I haven't used that technique before). In the past I have used the Autor test, following the test in this paper.

    best,
    Rhys



    • #3
      Thank you so much Rhys, that's really helpful!

      To answer your question, I have repeated cross sections (CPS data) rather than panel data. I have specified the treatment variable like you suggested.

      I have a few follow up questions:

      1. Does running this specification make sense? I got this suggestion from the following post.

      [Image: 01_TreatmentDefinition.PNG]


      I get some odd results, so I wonder if something is going wrong?

      [Image: 01_RegressionOutput.PNG]




      2. From the same post linked in point 1, the suggestion for testing parallel trends in a staggered diff-in-diff setting involves interactions between the treatment variable and pre/post exposure year dummies.

      I would like to confirm my understanding of how the pre/post exposure year dummies should be specified. Does the below make sense?

      [Image: 01_ParallelTrendsTesting.PNG]
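
      For reference, a minimal sketch of one common way to code pre/post exposure-year dummies (in Python, not Stata). It assumes event time is the calendar year minus the state's adoption year, that the year just before adoption is the omitted reference, and that distant years are binned at a window of three years. The window and the names event_time and event_dummies are assumptions for illustration, not taken from the linked post.

```python
def event_time(year, adopt_year, window=3):
    """Years relative to adoption, binned so that values beyond +/- window share the end bins."""
    return max(-window, min(window, year - adopt_year))


def event_dummies(year, adopt_year, window=3, reference=-1):
    """One 0/1 dummy per relative year, leaving out the reference year."""
    k = event_time(year, adopt_year, window)
    return {
        f"rel_{j:+d}": int(k == j)
        for j in range(-window, window + 1)
        if j != reference
    }


# An early state (adoption in 1970) observed in 1968 is two years before adoption.
print(event_dummies(1968, 1970))
```

      In a regression with state and year fixed effects, the coefficients on the negative (pre-adoption) dummies are the ones that speak to pre-trends; the non-negative ones trace out the effect after adoption.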





      3. Finally, I am interested in looking at a third level of variation, which in the regular set up would be a triple diff-in-diff. Suppose, in addition to the state and year variation I have, I also observe variation among age groups in some states. To be more specific, minors in some states don't have access to the treatment, while they do have access in other states. How would I go about incorporating this into the staggered (I guess it's also called generalized?) diff-in-diff setting? Since this variation only exists in some states, would I limit my triple diff-in-diff analysis to the states in which minors are given access to the treatment? And finally, how would I go about testing pre-trends in this setting?

      Once again, thank you for your help, I really appreciate it!

      Best,
      Ola















