
  • Difference in difference with policy evolution

    Hello all,

    I was hoping to get some help with an econometric question about difference-in-differences regression. I am studying a maternity leave policy from 2018 that increased the wage replacement rate; however, it was amended in 2020 to also increase leave duration by 2 weeks. I tried using only data from 2015-2019, but there are too few observations to obtain significant results. Is it possible to conduct the DID analysis using years 2015-2024 as long as I control for the "second treatment" in 2020? Is it as simple as controlling for year fixed effects using i.year (which I already included in my regression)?

    EDIT: my treatment group is mothers of infants in California and my two main control groups are mothers of older children in CA and mothers of infants in 3 states that do not have paid maternity leave.

    I can't seem to find much literature on DID with policy changes during the post-period (I also don't know what this type of DID variation would be called) and I am not exactly sure how to go about doing it. Any advice would be greatly appreciated. Thank you.
    Last edited by Iman Haupricht; 21 Jul 2025, 21:17.

  • #2
    Here is a clearer explanation of what I am doing:

    I’m working with pooled cross-sectional Current Population Survey data on California’s Paid Family Leave (PFL) program and need guidance on modeling a difference-in-differences (DiD) setup where the policy was introduced in one year and modified 2 years later. Specifically:

    AB 908 (effective Jan 2018) increased wage replacement rates

    SB 83 (effective July 2020) expanded PFL duration from 6 to 8 weeks

    My treatment group is mothers of infants in California, and control groups vary depending on age/region (one is California mothers of older children and another is mothers of infants in 3 other comparable states that do not have PFL). Treatment eligibility did not change over time.

    I would have simply excluded the years after the second policy change (SB 83), keeping only 2015-2020; however, this causes my model to lose a lot of statistical power, as there are few observations per year. I was wondering whether there is a way to control for this policy change in 2020, or even to separate the two effects and obtain estimates for both.

    Thanks for reading and I would really appreciate any help I can get.



    • #3
      In a conventional DID analysis you would have a treatment vs control group indicator variable and a pre-post time indicator variable. Here, instead of a simple pre-post time indicator your variable would have three levels: 0 = before 2018, 1 = from 2015 through June 2020, 2 = July 2020 and after. Let's call that variable era. Then your basic analysis becomes:
      Code:
      regress outcome i.group##i.era
      The output will include both 1.group#1.era and 1.group#2.era terms, and these will be the DID estimates of the treatment effect of the original intervention and the modified intervention, respectively. If you are interested in the incremental effect of the change made in 2020, -lincom 1.group#2.era - 1.group#1.era- will give you that.

      Now, it is not what you are asking about, but you might also consider expanding the group variable to multiple levels, because it seems to me that your control groups are pretty heterogeneous and might well respond differently to the interventions. The circumstances that mothers of older children face are different from those caring only for infants, and their responses to economic incentives may well differ. So I might expand group to be a three level variable: 0 = mothers of infants in other states, 1 = CA mothers of older children, and 2 = CA mothers of infants. In this case 2.group will be the actual treatment group, and 0 and 1.group will be the two control groups.

      Accordingly, the outputs of interest will be more complicated: the differences between 2.group#1.era and 0. or 1.group#1.era would be the DID estimates of the effect of the original treatment compared to the two control conditions, and the difference between 2.group#2.era and 0. or 1.group#2.era for the effect of the modified treatment compared to the two control conditions.

      On the other hand, as you are already struggling with sample size limitations, further chopping up the data like this is going to exacerbate those problems. I am not an economist, nor a labor analyst, so I have nothing better than lay intuitions to offer about this, but those lay intuitions tell me that you would be better off replacing the CA mothers of older children group with one or more additional states that don't have any intervention, so that you would have a single more homogeneous control group. But you should consult somebody with professional knowledge about this substantive question rather than relying on my intuition that mothers of older children are not a very suitable control group.
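      For concreteness, the expanded three-level group setup described above might be coded as follows. Note that ca, infant, and outcome are placeholder variable names, not taken from the original post; the lincom contrasts correspond to the comparisons listed above (with group 0 as the base, the 2.group#1.era coefficient is itself the contrast against the other-state controls):
      Code:
      * placeholder 0/1 indicators: ca (lives in California), infant (mother of an infant)
      gen byte group = 0                           // mothers of infants in other states
      replace group = 1 if ca == 1 & infant == 0   // CA mothers of older children
      replace group = 2 if ca == 1 & infant == 1   // CA mothers of infants (treated)

      regress outcome i.group##i.era

      * original policy, treated vs. each control group:
      lincom 2.group#1.era                         // vs. other-state mothers (base group)
      lincom 2.group#1.era - 1.group#1.era         // vs. CA mothers of older children

      * modified policy, treated vs. each control group:
      lincom 2.group#2.era
      lincom 2.group#2.era - 1.group#2.era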



      • #4
        Hello Clyde, I appreciate the time you took to read and answer! I am going to try to implement the era variable and will update once I have. As for the control groups, I think it was not clear, but they are actually already separate: I am using multiple control groups for more robust results, but they will be used in separate regressions. You are right that it would be too heterogeneous if they were combined into a single control group. I apologize for the confusion.

        Before realizing the SB 83 overlap I’d been using Donald & Lang (2007) (explained below) to handle inference with just one treated state. Now I’m running the DiD regression where both the treatment and control group live in California (e.g. mothers of infants vs. mothers of older children).

        Since I no longer have multiple state clusters and the Donald and Lang method might no longer be valid:
        1. What’s the best way to cluster or otherwise adjust standard errors in this within–CA setting?
        2. Would you recommend using wild‐bootstrap, or simply switching to robust SEs or some other method?
        3. And, in this context, is there still a role for Donald & Lang’s aggregation approach, or should I revert to a classic DiD inference strategy?
        I'm not sure whether you are familiar with the Donald and Lang two-step method, so here it is, briefly explained in my context. This method is used when there are very few clusters (such as a single state). In the first step, each outcome (i.e., maternity leave) is regressed on the full set of controls, survey year dummies, and the year dummies interacted with treatment status, with no constant. In the second step, the data are collapsed to 10 survey-year cells, and the coefficients on the interactions between treatment status and year are regressed on an indicator for post-2018, in a regression weighted by the sum of the March CPS Supplement person weights in each year.
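        For reference, the two-step procedure described above might be sketched in Stata roughly as follows. All variable names here (outcome, controls, treat, year, wtsupp) are placeholders, and this is only an illustration of the mechanics, not code tested against the actual data:
        Code:
        * Step 1: outcome on controls, full set of year dummies (no base level,
        * no constant), and year-by-treatment interactions
        regress outcome controls ibn.year ibn.year#1.treat, noconstant

        * Step 2: one observation per survey year; regress the stored
        * year-by-treatment coefficients on a post-2018 indicator,
        * weighted by the summed CPS supplement person weights
        preserve
        collapse (sum) wt = wtsupp, by(year)
        gen b_treat = .
        forvalues y = 2015/2024 {
            capture replace b_treat = _b[`y'.year#1.treat] if year == `y'
        }
        gen byte post2018 = year >= 2018
        regress b_treat post2018 [aw = wt]
        * with two policy eras, one could instead regress b_treat on era
        * indicators at this stage, though whether that preserves the
        * method's validity is exactly the open question here
        restore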

        --> However, I do not understand how to apply this method when there are multiple eras, so I was curious whether there is a better way to cluster for the within-CA groups.

        Thank you for your help



        • #5
          I don't know enough about the possible sources of dependencies and sources of heteroskedasticity in your data to advise you about how to model the vce. Perhaps somebody else can step in here.



          • #6
            No problem, thank you. Have a good day!



            • #7
              I just realized I read something wrong in your reply, Clyde:

              Here, instead of a simple pre-post time indicator your variable would have three levels: 0 = before 2018, 1 = from 2015 through June 2020, 2 = July 2020
               Is era = 1 meant to be 2018 through July 2020 instead of 2015 through July 2020? If it is not an error, I think I do not quite understand the method.



              • #8
                Yes, sorry, that should be 2018 through July 2020.
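                With that correction, one way to construct the era variable (assuming placeholder year and month variables for the survey date) would be:
                Code:
                * placeholder variable names: year (survey year), month (interview month)
                gen byte era = 0                                              // before 2018
                replace era = 1 if year >= 2018                               // AB 908: higher replacement rate
                replace era = 2 if year > 2020 | (year == 2020 & month >= 7)  // SB 83: 8-week duration
                label define eralbl 0 "pre-2018" 1 "2018-Jun 2020" 2 "Jul 2020+"
                label values era eralbl
                Since the March CPS supplement is interviewed in March, the 2020 survey year would fall entirely in era 1 under this coding, with era 2 beginning in survey year 2021.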



                • #9
                  Great, thanks!
