  • Using csdid for repeated cross-section and single treatment timing

    I am using the -csdid- command for DID regression.

    -ssc install drdid-
    -ssc install csdid-

    Details of my data:
    -Repeated cross section for 2010--2022 with gap of 2 years (except between 2018 and 2022 when the gap is 4 years).
    -Treatment is given all at once in 2018 at the district level.
    -Data is at individual level (repeated cross section): if you are in a treated district, I assume you are a treated individual.
    -I am looking at individual-level outcomes

    I was earlier using -didregress- but I kept getting errors in running pretrends/trendplots subcommands. I set up my did like this:

    gen treated=1 if selected_district==1 //identifying treated units
    replace treated=0 if selected_district!=1

    gen treatpost=0
    replace treatpost =1 if year>=2018 & treated==1 //interaction between treatment and time

    didregress (outcome) (treatpost), time(year) group(distID) vce(cluster distID)
    estat pretrends
    estat trendplots


    The error said "Treatment times vary" even though treatment was given all at once. I figured the problem was that the sample size in 2022 was larger than in 2018. As a result there were more "treated" units in 2022 than in 2018: I took it to mean that Stata considers this as the treatment being staggered. Using -csdid- solved this problem, and the command also ran much faster.
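
    One way to check where the "Treatment times vary" message comes from is to list the first year in which each district shows up as treated. This is only a minimal sketch, assuming the variables defined above (distID, year, treatpost); first_treated_year and district_tag are hypothetical names introduced for illustration. If some treated districts are first observed in 2022 rather than 2018, more than one year will appear in the table.

    Code:
    * first year in which each district is observed with treatpost == 1
    * (missing for never-treated districts)
    bysort distID: egen first_treated_year = min(cond(treatpost == 1, year, .))

    * flag one observation per district so each district is counted once
    egen byte district_tag = tag(distID)
    tabulate first_treated_year if district_tag, missing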

    My questions:
    1. Is csdid valid for single treatment timing as well?
    2. Are there any papers which show that csdid works with single treatment in repeated cross section settings?
    3. If answer to 1 is no, how can I tweak my didregress to avoid the staggered DID problem? I have used reghdfe also but I prefer using a built-in command as it makes getting parallel trend plots/tests easier.
    Last edited by Parul Gupta; 21 Jun 2025, 02:14.

  • #2
    If everyone was treated at the same time, you can't use csdid.
    Even with a single treatment time you need a comparison group, either never treated or not yet treated.


    • #3
      Parul: Let me recommend a more straightforward strategy, based on this paper with my coauthors. As it turns out, Fernando's jwdid produces the estimates described in this paper in the case of repeated cross sections:

      Code:
       jwdid y x1 ... xk, tvar(year) gvar(first_treat) never
      Then you can also use the testing and graphing capabilities.

      I took your setup to mean that some districts are untreated, and therefore you can use csdid or jwdid. But I think the latter is more straightforward for these purposes.


      • #4
        FernandoRios I should have been clearer. About 25% of all districts are treated in 2018. Others are never treated. Treatment is ongoing. Does csdid still apply (even with non-staggered treatment timing)?

        Jeff Wooldridge Professor, thanks for the suggestion! Some of my dependent variables are binary, so jwdid will be useful. Please advise on how to construct the gvar variable. This is what I am doing:

        Code:
        gen treat = 0
        replace treat = 1 if selected_dist == 1 // identifies the districts that received treatment; the remaining districts were never treated

        gen gvar = 0
        replace gvar = 2018 if treat == 1 // treated districts received treatment in 2018; all treatment was given together

        jwdid outcome, tvar(year) gvar(gvar) never
        To reiterate, my key question is whether this is applicable for non-staggered treatment timing with cross-sectional data. Further, should I expect large differences in results with respect to -didregress- or -reghdfe- estimation? Thanks so much for your input!


        • #5
          Yes, it is appropriate.


          • #6
            Thanks, is the gvar construction correct?


            • #7
              Yes. gvar is like your treated/not-treated indicator, but it holds the year of treatment instead of "1" (and 0 for never-treated units).
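
              As a minimal sketch of that setup for repeated cross sections, assuming the variable names used earlier in the thread (outcome, selected_dist, distID, year): no ivar() is given because individuals are not followed over time, and clustering at the district level is an assumption carried over from the didregress call in #1.

              Code:
              * gvar = first treatment year for treated districts, 0 for never treated
              gen gvar = cond(selected_dist == 1, 2018, 0)

              * Callaway-Sant'Anna estimator on repeated cross sections (no ivar())
              csdid outcome, time(year) gvar(gvar) cluster(distID)
              estat event    // event-study aggregation, including pre-treatment periods

              * extended TWFE estimator with never-treated districts as controls
              jwdid outcome, tvar(year) gvar(gvar) never cluster(distID)
              estat event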


              • #8
                Great, thanks so much.

                I have a related question: I ran didregress and csdid on dummy data to check whether both give the same result for single treatment timing. They do not. What could be the reason for this?

                I used the data from this link:
                https://libguides.princeton.edu/stat...apper-37890613 (Example 4)
