
  • #16
    Hi Richard,
    So, you are correct. The formula says that the PTA holds if the growth in the control group equals the growth the treatment group would have experienced absent treatment. This, however, cannot be estimated after treatment has been implemented.
    So, the way it is proxied in CSDID (and other DID estimators) is to analyze that change using data from before the treatment took place.
    Regarding the use of covariates:
    The way CS state the model, all characteristics are time-invariant, so X_{t}=X_{t-1}=X_{t-k}. It doesn't matter what period you look at; you are using the same values for X.
    Now, in practice, if you are using panel data, you use X_{t-1} to estimate the parallel trends (assuming t<g).

    Using CS language, treatment effects and pretrends are estimated in exactly the same way. They are all called att(g,t), which stands for the average treatment effect on the treated for group g at time t.

    When analyzing data "after" treatment (t>=g), att(g,t) compares outcomes in period t with outcomes in period g-1 (the last period without treatment). Here you use X_{g-1}.
    If you instead analyze data before treatment (t<g), att(g,t) compares outcomes in periods t and t-1. These att(g,t)'s are used for the pretrend test. Here you use X_{t-1}.

    What differs across estimators is simply how att(g,t) is estimated.
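
    In symbols, the two cases above can be sketched as follows. This is only the unconditional version with a never-treated comparison group (written C=1 here, which is a notational assumption); the covariate adjustment that each method applies is left out.

    ```latex
    \[
    \begin{aligned}
    ATT(g,t) &= E[Y_t - Y_{g-1} \mid G=g] - E[Y_t - Y_{g-1} \mid C=1], && t \ge g \\
    ATT(g,t) &= E[Y_t - Y_{t-1} \mid G=g] - E[Y_t - Y_{t-1} \mid C=1], && t < g
    \end{aligned}
    \]
    ```

    The only thing that changes between the two lines is the base period: g-1 for the treatment effects, t-1 for the pretrends.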

    HTH



    • #17
      Got it. Thank you so much.


      • #18
        Dumb question. I cannot figure out how to access the coefficients. Here is an example.
        use https://friosavila.github.io/playing...rdid/mpdta.dta, clear
        csdid lemp lpop , ivar(countyreal) time(year) gvar(first_treat) method(dripw) agg(event)
        test T+0
        gives me the error "T ambiguous abbreviation".
        I have tried all sorts of permutations to figure out how to do an F test that all the leads (or lags) are jointly significant, and I cannot work out how to do it.


        • #19
          Hi Richard,
          Unfortunately, you cannot use "test" to do what you describe right now. I'll try to add an option in a future update.

          There is, however, a trick that you can use.

          Code:
          use https://friosavila.github.io/playingwithstata/drdid/mpdta.dta, clear
          csdid  lemp lpop , ivar(countyreal) time(year) gvar(first_treat) method(dripw) agg(event)
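          * Save the event-study estimates and their variance-covariance matrix
          matrix b=e(b)
          matrix V=e(V)
          
          * Small eclass wrapper so that "ereturn post" can be called interactively
          capture program drop addex
          program addex, eclass
          ereturn `0'
          end 
          
          * Give every coefficient a plain name that "test" can parse
          matrix colname b = t1 t2 t3 t4 t5 t6 t7
          matrix colname V = t1 t2 t3 t4 t5 t6 t7
          matrix rowname V = t1 t2 t3 t4 t5 t6 t7
          addex post b V
          
          test t1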
          So the idea is to get the coefficients and their variances and covariances, rename all columns and rows, and post them as a new estimation result. You can then test the coefficients with "test".
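
          For the joint test asked about above, the same trick extends to several coefficients at once. A minimal sketch, assuming the seven event-study coefficients are ordered from the earliest pre-treatment period to the latest post-treatment period, so that t1-t3 are the leads and t4-t7 the lags. That ordering is an assumption; list the matrix b before renaming its columns to confirm it.

          ```stata
          * Run after "addex post b V" above.
          * Assumption: t1-t3 are the pre-treatment periods, t4-t7 the post-treatment ones.

          * Joint Wald test that all pre-treatment coefficients are zero
          test t1 t2 t3

          * Joint Wald test that all post-treatment coefficients are zero
          test t4 t5 t6 t7
          ```

          Since no residual degrees of freedom are posted along with b and V, "test" should report a chi-squared statistic rather than an F statistic.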
          HTH


          • #20
            Thanks for coming up with a solution & giving an explanation for how your code works.


            • #21
              Dear Fernando,

              Thank you very much for this implementation. I am having a blast experimenting with it as it is very clear and straightforward to run.

              I want to ask you how you would recommend estimating heterogeneous treatment effects. From your 2021 Stata Conference slides, I read that "Embrace TE heterogeneity in the same way as teffects does in cross-section setups." However, I am not sure how to implement it.

              Please, would you be able to give an example? As an experiment, I was trying to estimate heterogeneous effects for lpop quantiles from your mpdta database from the help file (even if this does not make much sense).

              Code:
              use https://friosavila.github.io/playingwithstata/drdid/mpdta.dta, clear
              xtile q_lpop = lpop, n(5)
              tabulate q_lpop
              
              5 quantiles |
                  of lpop |      Freq.     Percent        Cum.
              ------------+-----------------------------------
                        1 |        500       20.00       20.00
                        2 |        500       20.00       40.00
                        3 |        500       20.00       60.00
                        4 |        500       20.00       80.00
                        5 |        500       20.00      100.00
              ------------+-----------------------------------
                    Total |      2,500      100.00
              
              *This gives me the ATT aggregation but no heterogeneous effects. It uses q_lpop as a control variable.
              
              csdid lemp q_lpop , ivar(countyreal) time(year) gvar(first_treat) method(dripw) agg(simple)
              ............
              Difference-in-difference with Multiple Time Periods
              Outcome model  : least squares
              Treatment model: inverse probability
              ------------------------------------------------------------------------------
                           |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
              -------------+----------------------------------------------------------------
                       ATT |  -.0411743     .01142    -3.61   0.000    -.0635571   -.0187915
              ------------------------------------------------------------------------------
              Control: Never Treated
              How should I proceed with the csdid command to compute the heterogeneous effects of q_lpop?

              Thanks a lot!


              • #22
                So, 1) those are really Pedro Sant'Anna's slides. He is the one who came up with the estimator; I'm just the interpreter!
                2) You cannot estimate different effects by q_lpop. All the heterogeneity comes from the treatment timing and from when the effect is measured.

                If you do only
                csdid lemp q_lpop , ivar(countyreal) time(year) gvar(first_treat) method(dripw)

                It will produce, by default, treatment effects for all groups (those treated in 2004, 2006 and 2007) at all points in time (2004, 2005, 2006, 2007).
                That is the kind of heterogeneity referred to in that last slide.
                F


                • #23
                  Thank you so much for the clarification and the quick response.

                  Please, let me know if you have any hints on approaching heterogeneous effects (by a covariate). The referees have been very demanding in this regard.

                  Thanks again!


                  • #24
                    If q_lpop has enough variation, you can always try running csdid by subsample:
                    xtile qq=q_lpop, n(5)
                    csdid y if qq==1
                    etc
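
                    Spelled out for the mpdta example from #21, the subsample idea could look like the sketch below. It assumes that lpop does not change within a county over time, so that every county stays in a single quintile, and it uses "capture noisily" because some subsamples may have too few treated units for every ATT(g,t) to be estimated.

                    ```stata
                    use https://friosavila.github.io/playingwithstata/drdid/mpdta.dta, clear

                    * Quintiles of lpop, as in post #21
                    xtile q_lpop = lpop, n(5)

                    * One csdid run per quintile; keep looping even if one subsample fails
                    forvalues q = 1/5 {
                        display as text "Quintile `q' of lpop"
                        capture noisily csdid lemp if q_lpop == `q', ivar(countyreal) time(year) gvar(first_treat) method(dripw) agg(simple)
                    }
                    ```

                    Each run is a separate estimation, so the five ATTs are not tested against each other here.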


                    • #25
                      When I run csdid, I get the following output

                      ................xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
                      xxxxxx..................xxxxxxxxxxxxxxxxxxxxxxxxxx
                      xxxxxx

                      Eventually, I get estimates, but I am concerned about using them.

                      I assume that the "x" are bootstrap samples where the procedure produced an error. Is there any way to figure out what is causing the error?

                      P.S. : I tried "set trace on" but I did not get anything useful out of it, or maybe I missed what I should have seen.


                      • #26
                        Hi Richard,
                        No, the "x" and "." are not bootstrap repetitions. Each dot represents a particular 2x2 DID estimate. If an "x" appears, it usually means that particular ATTGT could not be estimated, for example because of insufficient data.
                        In all those cases, you will see blanks in the basic CSDID output.
                        Let me know if you have other questions.
                        F


                        • #27
                          Ok, great! I am surprised, however, at how long it takes for the program to figure out that a particular 2x2 DID could not be estimated.

                          So, for instance, in the output

                          ................xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

                          the string of "xxxx" took a * long * time to estimate.

                          Also, the csdid help file states: "Additionally, you may not need ALL periods, requiring only few periods before the first treatment year." It seems to me that by doing that, one ends up with a bunch of 2x2 DIDs that cannot be estimated, so it would be nice if there were a faster way to get through those.


                          • #28
                            Yes, I agree.
                            In some cases the x's occur during the logit/ipt step, so perhaps that is what is holding the process up.
                            I'll try to check that in the code.


                            • #29
                              Thanks. Do you need me to generate a data set that causes these problems?


                              • #30
                                That would be very helpful. If you can contact me at [email protected] and send me a replicable example, I can take a look and see whether my guess about the problem is correct.
                                Thank you
