
  • Need help with Difference in Difference (DDD)

    I am looking to estimate the effect of a law that passed in different years in different states. For instance, it passed in TX in 1995 and in NY in 1998. I have 11 treated states, 11 control states, and 11 dates. I defined a "post" dummy equal to 1 for each state after its implementation date. I defined a "treated" dummy equal to 1 for all states that passed the law at any point and 0 otherwise. I did similarly for the control group. Now when I run the model Y_it = B0 + B1*treated_i + B2*post_t + B3*treated_i*post_t + e_it, Stata omits the B3 treated*post variable.
    I know I am doing something wrong here, but I have never run a DD model with multiple dates and many treatment and control groups.
    I would really appreciate it if anyone could help me with general instructions or any helpful code.

  • #2
    So the problem is that you have not properly created the post variable. You did it right for the states that passed the law. But what about the control states? You have to define it for them, too, and if, as I think you have done, you just set it to zero in all years, you will get the (incorrect) results you have gotten.
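
    To see why the interaction is dropped: if post is 0 in every year for every control state, then post can equal 1 only where treated equals 1, so treated*post is the same column as post. A minimal Stata sketch on a made-up panel (the state numbering, the years, and the variable names enact and tp are illustrative assumptions, not the poster's data):

    ```stata
    * Toy panel: 4 states x 10 years (1990-1999); all values are made up
    clear
    set obs 4
    gen state = _n
    expand 10
    bysort state: gen year = 1990 + _n - 1

    gen treated = state <= 2                         // states 1-2 passed the law
    gen enact   = cond(state == 1, 1995, 1998) if treated
    gen post    = treated & year >= enact            // 0 in every year for controls

    gen tp = treated * post                          // the interaction term
    assert tp == post                                // identical to post, row by row
    ```

    With tp identical to post, a regression on treated, post, and tp cannot identify all three coefficients, so Stata has to drop one of them.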

    In the classic DID design, all of the treated entities begin treatment at the same time. In that situation, you have a single year for adoption, and you define the post variable to be 1 in all years from that date on (regardless of whether the state is a treatment state or not) and zero in all years prior to that date (again regardless of whether the state is a treatment state or not.) You can think of that year as the year in which those states which passed the law did so, and the year in which those who didn't pass the law would have done so if (counterfactually) they had passed it.

    Now, your data don't have a single year that you can use in this way. So what you have to do is impute to each control state a year in which they "would have" passed the law. There are three ways you can approach this. One is to examine the legislative records and find out in which year each state considered the law but voted it down (or tabled it, or whatever they did to not adopt it.) That may not be feasible, or it may be that some or all of the non-adopting states never even considered it. In that case, the next best approach is to match each control state to a treated state based on salient sociological and political and demographic and economic characteristics. Then you pick the "would have enacted" date for the control state to be the actual date of enactment of its matched treated state. Finally, if you do not have and cannot obtain suitable data to use for the matching, then your fallback position is to match each control state with a randomly selected treated state and, again, use the actual enactment date for the matched treated state as the "would have enacted" date for the control.
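
    The random fallback could be sketched in Stata roughly as follows. This assumes a state-year panel with hypothetical variable names state, year, treated (0/1), and enact_year (the actual enactment year, missing for control states), and it assumes equal numbers of treated and control states so that a one-to-one random pairing is possible. Pairing one-to-one rather than drawing with replacement is a choice made here for illustration. Run it as a do-file, since it relies on local macros for the temporary files.

    ```stata
    set seed 12345                     // arbitrary seed so the pairing is reproducible

    * Shuffle the treated states' enactment years into a random order
    preserve
        keep if treated == 1
        keep state enact_year
        duplicates drop
        gen double u = runiform()
        sort u
        gen long pair = _n
        rename enact_year wouldhave_year
        keep pair wouldhave_year
        tempfile dates
        save `dates'
    restore

    * Give each control state the year at the same position in the shuffle
    preserve
        keep if treated == 0
        keep state
        duplicates drop
        sort state
        gen long pair = _n
        merge 1:1 pair using `dates', nogenerate
        keep state wouldhave_year
        tempfile ctrl
        save `ctrl'
    restore

    * Bring the imputed "would have enacted" year back into the panel
    merge m:1 state using `ctrl', nogenerate
    replace enact_year = wouldhave_year if treated == 0
    drop wouldhave_year
    ```

    For the matching approach based on state characteristics, only the pairing step would change: the pair identifier would come from the match rather than from a random sort.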

    With that done, you then define the post variable to be 1 for all years including and after the year of enactment (for treated states) or the "would have enacted" year (for control states), and zero for all years preceding.
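
    As a rough sketch of that last step, assuming enact_year already holds the actual year for treated states and the "would have enacted" year for control states, and using y as a placeholder name for the outcome (all variable names here are hypothetical). Clustering the standard errors by state is an added choice that is common with state panels, not something stated above.

    ```stata
    * post = 1 from the (actual or imputed) enactment year onward, 0 before
    gen byte post = year >= enact_year if !missing(enact_year)

    * Numeric state identifier for clustering (works for string or numeric state)
    egen long state_id = group(state)

    * The coefficient on 1.treated#1.post is the DID estimate
    regress y i.treated##i.post, vce(cluster state_id)
    ```

    Because post now varies over time within the control states too, the interaction is no longer a copy of post and should not be omitted.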

    • #3
      You might want to consult http://faculty.haas.berkeley.edu/ross_levine/papers.htm: "Big Bad Banks: The Winners and Losers from Bank Deregulation in the United States" (with Thorsten Beck and Alexey Levkov), Journal of Finance, October 2010, 1637-1667. Lead article; a data appendix and data are linked there.

      Ho-Chuan (River) Huang
      Stata 19.0, MP(4)

      • #4
        In reply to Clyde Schechter (#2):
        Thank you so much, Clyde, for your reply. I have two more questions, in case you get the chance to reply. As I understand your instructions, I need to assign each control state a date so that I can define the post variable for it too. Whatever steps I take to choose that date, should I handle the states pair by pair? For example, if my treated state is Texas with 1995 as the implementation year and its control state is California with the same date, I set post equal to 1 after 1995 if the state is TX or CA. Likewise, if my treated state is GA (with MI as its control state) and the year is 1992, I set post to 1 after 1992 if the state is GA or MI, and 0 otherwise. So I do this for each pair of states with similar characteristics, but in the end I have only one variable 'post' with different 1s and 0s, correct?
        Also, some states adopted the law in one year and rejected it some years later. For example, NY adopted the law in 1990 and rejected it in 2003. What do you recommend if I want to account for these changes in my analysis (as a robustness check)? I should mention that I did not include states with this behavior in my original treatment group.

        • #5
          As I understand your instructions, I need to assign each control state a date so that I can define the post variable for it too. Whatever steps I take to choose that date, should I handle the states pair by pair? For example, if my treated state is Texas with 1995 as the implementation year and its control state is California with the same date, I set post equal to 1 after 1995 if the state is TX or CA. Likewise, if my treated state is GA (with MI as its control state) and the year is 1992, I set post to 1 after 1992 if the state is GA or MI, and 0 otherwise. So I do this for each pair of states with similar characteristics, but in the end I have only one variable 'post' with different 1s and 0s, correct?
          Correct.

          Also, some states adopted the law in one year and rejected it some years later. For example, NY adopted the law in 1990 and rejected it in 2003. What do you recommend if I want to account for these changes in my analysis (as a robustness check)? I should mention that I did not include states with this behavior in my original treatment group.
          This gets complicated and I'm reluctant to advise you. In the simplest case, repealing the law just puts everything back the way it was before it was adopted. But the real world is seldom like that: typically the post-repeal situation differs from both the pre-enactment period and the period when the law was in effect. So finding a model that reflects this properly is difficult and depends on a pretty intimate understanding of the effects of both enactment and repeal of the law or policy. It also raises the questions of whether you need a separate control group for the repeal process and which states should be part of it. There really are a lot of issues. I think you need to hash that out with an expert in your discipline who is familiar with this situation. Once you arrive at a modeling approach, you can post back here for Stata programming assistance if desired.
