  • Difference-in-Differences reg vs xtreg

    Dear all,

    I read through many discussions concerning difference in differences analyses but have not quite found an answer.

    I have data on several industries and years before and after a law was changed for some of these industries. Thus, I want to do a DID analysis.

    First, I want to analyze all years before and after the law was changed. From reading previous discussions, I got the sense that this would be the appropriate DID command:

    Code:
    xtreg DepVar i.year i.treated##i.period, fe robust
    where DepVar is the dependent variable, year is the calendar year, treated is a dummy denoting which industries are in the treatment group, and period is a dummy equal to 1 once the treatment was switched on.

    In a second step, I want to focus on just the two years between which the change was implemented. However, I am wondering which of the following commands is correct:

    Code:
    xtreg DepVar i.treated##i.period, fe robust
    or
    Code:
    reg DepVar i.treated##i.period, robust
    or
    Code:
    reg DepVar i.treated##i.period i.industry, robust
    where industry is a categorical variable identifying each industry?

    They all give the same coefficient estimates, but the standard errors are different, so I don't know which one to trust.

    I would really appreciate it if someone could clear up my confusion!

    Thanks a lot!

    All the best
    Leon

  • #2
    You should use -xtreg-: you still have panel data.

    If you were not specifying the -robust- option, -reg ... i.industry- would give you the same results as -xtreg, fe-. But -robust- means different things to -reg- and -xtreg, fe-. In -reg-, -robust- gives you an unclustered standard error that is nevertheless robust to heteroscedasticity. In -xtreg, fe-, the unclustered standard error is invalid, so Stata automatically converts it to a clustered standard error, clustering on your panel variable. Since the unclustered standard error is not valid for this kind of analysis, the -reg- results should not be used.
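
    As a minimal sketch of the point above, the clustering can be requested explicitly so that it no longer depends on what -robust- means to each command. The -xtset- line is an assumption (the thread does not show how the panel was declared); it treats industry as the panel variable and year as the time variable.

    ```stata
    * Assumption: industry is the panel identifier and year the time variable.
    xtset industry year

    * -robust- after -xtreg, fe- is treated as clustering on the panel variable,
    * so these two commands should report the same standard errors.
    xtreg DepVar i.treated##i.period, fe vce(robust)
    xtreg DepVar i.treated##i.period, fe vce(cluster industry)

    * The dummy-variable version, with the same clustering requested explicitly.
    reg DepVar i.treated##i.period i.industry, vce(cluster industry)
    ```

    The coefficient on the interaction term should match across all three commands. The standard errors from the last command may still differ somewhat from the -xtreg- ones, because the two commands apply different small-sample degrees-of-freedom adjustments.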


    • #3
      Dear Clyde,

      Thank you very much for your fast and detailed response. I've read many of your responses in other discussions and they're always very clear and helpful, thank you!

      In fact, I learned from one of your posts that I should include the i.year variable (year fixed effects) in the first regression, the DID that uses all years before and after the change. Intuitively this makes sense to me, but I do not have a full explanation for it. Could you provide one, if possible?

      Thanks again!

      Best,
      Leon


      • #4
        Well, perhaps you learned the wrong lesson from something I wrote. I do not advocate always adding i.year to these models. I advocate it when the outcome variable is one that is subject to yearly shocks that ought to be taken into account. If the outcome variable is stable, then i.year is not needed. If the outcome variable exhibits linear or near-linear trends rather than arbitrary shocks, then including c.year would make more sense. There is nothing automatic. You have to think about the real-world forces acting on your outcome variable and model them accordingly. The point is to make the regression model as similar to the real world data generating process as you can.
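
        The three cases described above could be written roughly as follows. This is only a sketch that reuses the variable names from the first post and assumes the data have already been declared with -xtset industry year-; which version is appropriate depends on the outcome variable, not on the code.

        ```stata
        * Stable outcome: no year terms beyond the pre/post indicator.
        xtreg DepVar i.treated##i.period, fe vce(robust)

        * Linear or near-linear trend: year enters as a continuous variable.
        xtreg DepVar c.year i.treated##i.period, fe vce(robust)

        * Arbitrary yearly shocks: one indicator per year.
        * Stata will omit an extra year indicator here, because period is
        * collinear with the set of year indicators.
        xtreg DepVar i.year i.treated##i.period, fe vce(robust)
        ```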


        • #5
          Ah ok, thank you very much for explaining this, Clyde! I thought i.year was a must-have for the DID model when implementing it on panel data.

          I am still thinking about the rationale for using the xtreg command here. Is it just to correct for time-invariant factors within each industry? I started learning the DID model using the slides by Torres-Reyna http://www.princeton.edu/~otorres/DID101.pdf where he uses the reg command in Stata. So including industry fixed effects (by using xtreg or -reg ... i.industry-) and time fixed effects is an extension of the more basic model presented in the slides, correct?


          • #6
            No, it is not a must-have for the classical DID model. (It is a must-have for a generalized DID, but your code is not of that nature anyway.)
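
            For contrast, a generalized DID is often written with a single indicator for the treatment being in effect, plus year indicators. In the sketch below, in_effect is a hypothetical 0/1 variable (it is not part of the data described in this thread) equal to 1 in the industry-years where the changed law applies. It illustrates the structure only and assumes -xtset industry year- has been run.

            ```stata
            * Hypothetical variable: in_effect = 1 when the law applies to that industry-year.
            xtreg DepVar i.in_effect i.year, fe vce(cluster industry)
            ```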

            "Is it just to correct for time-invariant factors within each industry?"
            Well, it's mostly for that, but that is no small thing. That's huge! Also, it's needed to account for the fact that your observations are not independent but are nested within industry. Just specifying cluster robust standard errors does not fully deal with that.

            I'm not familiar with the slides you linked, and, unfortunately, I do not have the time to review them today.


            • #7
              Thanks a lot for your continued feedback, Clyde! Your explanations were, as always, very helpful!
