  • Diff in Diff: DRDID and CSDID

    Dear all,
    Thanks to Prof Baum, the commands drdid and csdid are up.
    drdid implements the doubly robust diff-in-diff estimators proposed by Sant'Anna and Zhao (2020).
    csdid implements the DiD estimator for multiple time periods proposed by Callaway and Sant'Anna (2020).
    Please let me know if you find any bugs or have questions about how to use the new commands.
    Best wishes
    Fernando

  • #2
    Hello,

    I would like to use csdid in Stata, but I am running into an issue.
    Here is my command:

    csdid mean_light i.state, ivar(village_id) time(year) gvar(cohort) method(dripw) notyet reps(20) cluster(district_id)

    I get the error: year> invalid varname
    However, if I omit cluster(), there is no error and the command runs fine.

    Any idea why this could be happening?
    Thank you in advance for any help,
    Best,

    Julia

    • #3
      Hi,
      Can you share a replicable example with the same problem?
      Sharing the data itself would be helpful, since I'm not sure why this would be happening.
      Thank you

      • #4
        Hi, thank you for your reply. The problem disappeared when I changed the clustering variable (from string to numeric). I am not sure how this is linked to the error message I previously had (year> invalid name) but anyway, it works now!
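
        A minimal sketch of the fix described above, assuming the clustering variable is stored as a string. egen with group() maps each distinct string to an integer code that can be passed to cluster(). The toy data and the name district_num are made up for illustration; the csdid call reuses the variable names from #2 and is commented out because it needs the original data.

        ```stata
        * Toy data: a string clustering variable (values are hypothetical)
        clear
        input str10 district_id
        "north"
        "north"
        "south"
        "east"
        end

        * Map each distinct string to an integer code (assigned in sort order)
        egen district_num = group(district_id)

        * Stop with an error if the new variable is not numeric
        confirm numeric variable district_num

        * With the original data, the call from #2 would then become:
        * csdid mean_light i.state, ivar(village_id) time(year) gvar(cohort) method(dripw) notyet reps(20) cluster(district_num)
        ```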

        • #5
          Thank you for letting me know!
          I'll add a check for that in the next update to avoid this problem.
          Best wishes

          • #6
            Thank you so much for programming csdid FernandoRios! I started using it recently and it is just great! I have a couple of clarification questions that maybe you can answer.

            1) I have a firm-level panel and estimate the impact of adopting a new technology. In a TWFE setting, I would add industry-by-year FE. Is something like that possible in csdid or is that not sensible?

            2) The firm-level panel is very long (around 40 years). In my previous event studies, I usually capped all periods at +/-10 years before/after adoption of a new technology. csdid, however, uses and displays all periods. Is there a way to cap the estimation at +/-10 years, or to display only +/-10 years instead of the full range? Or would that violate the approach?

            3) I don't really understand the role of pre-trends in csdid. What does it mean if the post-estimation pretrend test is significant? Does an event study using csdid account or correct for that? I also looked at other commands, e.g. by de Chaisemartin & d'Haultfoeuille. If I am not mistaken, these commands do not account for pre-trends. So it would be nice to know how csdid is different in this respect.

            Thank you very much in advance if you could answer these questions or point me to relevant resources!

            All the best,
            Leon

            • #7
              Hi Leon,
              I'm glad you are finding the command useful.
              Some clarifications:
              1) You shouldn't add both fixed effects to the model. csdid estimates the effects using all good 2x2 designs. You can read a simple explanation of how it works here: https://friosavila.github.io/playing...n_didmany.html
              Because of this design, individual and time fixed effects are already incorporated. That is why you would type: csdid y, ivar(firmid) time(year) gvar(treatment_cohort)

              2) If your data is very long, you have two options. First, you can drop observations more than, say, 5 periods after the latest treated cohort and more than 5 periods before the treatment cohort. That will save you time.
              Second, regarding estimation and presentation itself, if you are using the asymptotic results, after running csdid you can do:
              estat event, window(-10 10)

              3) Pretrends in csdid are only for testing; they are not used in the estimation. However, if the post-estimation pretrend test is significant, it suggests that the DID assumptions do not hold, and you can't really use this method.
              Of course, you can argue that parallel trends hold over some window (say, for the 10 periods before treatment but not for 15).

              Hope this helps
              Fernando
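
              The workflow in points 2) and 3) can be sketched on simulated data as below. This is only an illustration under stated assumptions: the panel, the variable names, and the window of +/-3 periods are made up; gvar is coded 0 for never-treated firms; and estat pretrend and csdid_plot are used as I understand them from the csdid help file. It requires drdid and csdid to be installed.

              ```stata
              * ssc install drdid
              * ssc install csdid

              * Simulated panel (hypothetical): 200 firms observed 2001-2010
              clear
              set seed 12345
              set obs 200
              gen firmid = _n
              * Cohort = first treated year; 0 marks never-treated firms
              gen treatment_cohort = cond(firmid <= 50, 0, cond(firmid <= 125, 2005, 2008))
              expand 10
              bysort firmid: gen year = 2000 + _n
              * Outcome: common linear trend, noise, and a treatment effect of 1
              gen y = 0.1*(year - 2000) + rnormal() + (treatment_cohort > 0 & year >= treatment_cohort)

              * Estimate all ATT(g,t)'s with asymptotic standard errors
              csdid y, ivar(firmid) time(year) gvar(treatment_cohort)

              * Joint test that all pre-treatment ATT(g,t)'s are zero
              estat pretrend

              * Event-study aggregation restricted to 3 periods before and after
              estat event, window(-3 3)

              * Plot the event study just computed
              csdid_plot
              ```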

              • #8
                Dear Fernando,

                Thanks a lot for the explanations!

                Regarding 3: I understand. Behind the question was this tweet by Peter Nencka (https://twitter.com/peternka/status/1381668050164912129), who uses the csdid approach (but in R) and writes that "For us, the new methods are *not* a robustness check on a standard TWFE model. They fix "errant" pre-trends that we have been thinking about for years and could have torpedoed the project". But I don't really see how the csdid approach "fixes" errant pre-trends. Is he saying that because the approach avoids the "bad" comparisons from two-way FE?

                Also: I managed to produce a graph of an event study using the csdid command. From the help file, I understand that "ATT's are estimated using all periods relative to the period of the first treatment, across all cohorts." Based on that, I don't really understand why there are coefficients estimated for t-1 AND t+0. Is t+0 the period where treatment begins? In my event studies before, I always had one omitted category (t-1) and expressed all coefficients relative to that. In the csdid_plot I'm not seeing an omitted category, which is why I'm a bit confused. But I am probably missing something.

                Thanks a lot again!

                All the best
                Leon

                • #9
                  I cannot really say much about why Peter says that. But my understanding of this approach is as you say: by avoiding "bad" comparisons, it is possible to obtain sensible estimates of the ATTs.
                  Regarding the graphs:
                  csdid identifies the event-study coefficients differently.
                  If you are looking at periods AFTER treatment, the effect is measured as:
                  E(DY|t)-E(DY|g-1) (or, as you say, using the last period before first treatment)
                  But for periods before treatment it uses:
                  E(DY|t)-E(DY|t-1)

                  There is no explicit baseline or omitted category.
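
                  Written out a bit more explicitly, the two cases can be sketched as follows. The notation is an assumption added for illustration: G = g denotes the cohort first treated in period g, C denotes the comparison group (never treated or not yet treated), and covariates are ignored.

                  ```latex
                  \begin{aligned}
                  t \ge g:&\quad ATT(g,t) = E[Y_t - Y_{g-1} \mid G = g] - E[Y_t - Y_{g-1} \mid C] \\
                  t < g:&\quad ATT(g,t) = E[Y_t - Y_{t-1} \mid G = g] - E[Y_t - Y_{t-1} \mid C]
                  \end{aligned}
                  ```

                  Each pre-treatment coefficient is therefore a one-period change rather than a change relative to a common omitted period.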

                  • #10
                    Thank you again so much for your help! I think I slowly begin to understand.

                    I was puzzled that there is a coefficient for t-1, but if I understand you correctly this coefficient is using t-2 as base period?

                    And the "g-1" denotes the last period before first treatment for group G?

                    • #11
                      Exactly.
                      That is also how the coefficients for the ATT(g,t)'s (simple output) are named:
                      the equation is the cohort, and each coefficient is named
                      t_t1_t2
                      where t1 is the pre period and t2 the post period.
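
                      As a small illustration of this naming rule, the loop below prints the names for one cohort. The cohort (2006) and the range of years (2004 to 2007) are hypothetical, and the loop only mimics the rule described in this thread; it does not call csdid.

                      ```stata
                      * Hypothetical cohort first treated in 2006
                      local g = 2006
                      forvalues t = 2004/2007 {
                          * Before treatment the base period is t-1; from g onward it is g-1
                          local base = cond(`t' < `g', `t' - 1, `g' - 1)
                          display "cohort `g': t_`base'_`t'"
                      }
                      * Prints:
                      * cohort 2006: t_2003_2004
                      * cohort 2006: t_2004_2005
                      * cohort 2006: t_2005_2006
                      * cohort 2006: t_2005_2007
                      ```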

                      • #12
                        Cool, thanks a lot!

                        • #13
                          I looked at the Journal of Econometrics articles, and I don't see those pre-trend coefficients defined there (maybe I missed it). Is there any place that discusses how those terms are computed?

                          • #14
                            Hi Richard,
                            The article doesn't define the pretrend coefficients explicitly. But if you look at pg 5, Assumptions 4 and 5, you will see that CS (2020) defines the parallel trends assumptions, which is where the pre-trend coefficients come from.
                            Namely, the PTA is tested by looking at the short-term 2x2 DID estimates using only pretreatment data.
                            HTH

                            • #15
                              Sorry, but I am still confused. In Assumption 4, the formula is only for t \geq g - \delta. So, if \delta = 0, it applies only for t \geq g. Also, my understanding is that X in that formula is X_{g-1}. So, are the pretrends computed conditional on X_{g-1}, or on earlier versions of X_t? Finally, three different versions of the ATT effects are used (or, ipw, dr). Are the same formulas used for the pretrends regardless of which option is selected for the ATT?
