
  • sdid, sdid_event and covariates using optimized method

    Hi,

    I am working with a panel dataset whose panel identifier is at the region × occupational sector level, covering multiple years. The treatment is staggered: there are two adoption times, and all remaining units stay untreated throughout. I have an outcome of interest, Y.
    I would like to estimate an event study using the sdid_event command from the sdid_event package (Ciccia, Clarke & Pailañir), and include a set of covariates that are essentially Occupation × Year fixed effects, to control for time-varying occupational sector shocks.

    From my understanding of both the sdid and the sdid_event package:
    • When I use the covariates(...) option in sdid, without the projected keyword, the algorithm performs a joint optimization. That is, it estimates the covariate coefficients $\hat{\beta}$ simultaneously with the unit and time weights, so as to minimize the weighted imbalance in pre-treatment residual outcomes. This is the “optimized” covariate adjustment approach described in Arkhangelsky et al. (2021).
    • When I use the covariates(...), projected option, it instead runs a two-way fixed effects regression of the outcome on covariates only using untreated observations, computes residuals, and then applies SDID to those residuals. This corresponds to the approach proposed by Kranz (2022).
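
    The projected procedure in the second bullet can be sketched by hand as follows. This is only an illustration under stated assumptions, not the package's internal code: the toy panel and the variable names (id, t, d, x, y, y_res) are hypothetical, and it assumes sdid is already installed (for example via ssc install sdid). The resulting ATT should be close to what covariates(x, projected) returns, though not necessarily identical.

    ```stata
    * Hypothetical toy panel: 20 units, 8 periods, units 11-20 treated from period 5
    clear
    set seed 2024
    set obs 160
    gen id = ceil(_n / 8)
    bysort id: gen t = _n
    gen d = (id > 10) & (t >= 5)
    gen x = runiform()
    gen y = 0.5 * x + rnormal() + d

    * Step 1: two-way fixed effects regression of y on x, untreated observations only
    quietly areg y x i.t if d == 0, absorb(id)

    * Step 2: remove the estimated covariate component from every observation
    gen double y_res = y - _b[x] * x

    * Step 3: apply SDID to the residualized outcome
    sdid y_res id t d, vce(noinference)
    ```

    Only the covariate term is subtracted in step 2; the unit and time fixed effects stay in y_res, since SDID handles those through its own weighting and fixed effects.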
    Now, for my event study, I would like to use the sdid_event command and include the covariates described above. However, I notice that sdid_event currently only allows the projected covariate adjustment method.

    My question is:
    Is it theoretically or computationally possible to implement the "optimized" covariate adjustment within sdid_event? That is, to jointly estimate the covariate coefficients and the weights in the event study setting, as is done in the regular sdid command?
    If anyone has extended or can help in extending sdid_event in this direction or knows whether this functionality already exists, I would really benefit from some help.
    I'm attaching code that simulates a dataset quite similar to mine. If possible, please help me with an event study estimation for staggered treatment, using the 'optimized' method for including covariates.
    Code:
    clear
    set obs 144  // 8 regions * 3 occupational sectors * 6 years/time periods = 144 obs
    
    // Generate panel identifiers
    gen region_id = ceil(_n / (3 * 6))       // 1 to 8
    gen occ_id = mod(ceil(_n / 6) - 1, 3) + 1  // 1 to 3 (loops every 6 obs)
    gen year = mod(_n - 1, 6) + 1           // 1 to 6
    
    // Label regions for interpretability
    gen region = ""
    replace region = "a" if region_id == 1
    replace region = "b" if region_id == 2
    replace region = "c" if region_id == 3
    replace region = "d" if region_id == 4
    replace region = "e" if region_id == 5
    replace region = "f" if region_id == 6
    replace region = "g" if region_id == 7
    replace region = "h" if region_id == 8
    
    // Create Occupational Sector var as string
    gen occupation = "occ" + string(occ_id)
    
    // Create unique panel id: Region * Occupational Sector
    egen panel_id = group(region_id occ_id)
    
    // Outcome - random 
    set seed 12345
    gen outcome = runiform()
    
    // Generate treatment indicator (Staggered treatment adoption timing- differs between region "e" and region "c", all others are always not exposed to treatment)
    gen post = 0
    replace post = 1 if region == "e" & year >= 5
    replace post = 1 if region == "c" & year >= 3
    egen occtrends = group(occupation year) // occupational sector specific time trends
    tab occtrends, gen(occtrends_) // dummies for occupational sector specific time trends
    
    ***SDID** 
    *No covariates
    sdid outcome panel_id year post, vce(noinference)
    // ATT = -0.10393
    *Optimized method for covariates
    sdid outcome panel_id year post, vce(noinference) covariates(occtrends_*, optimized)
    // ATT = -0.13821
    *Projected method for covariates 
    sdid outcome panel_id year post, vce(noinference) covariates(occtrends_*, projected)
    // ATT =-0.08656
    
    **SDID- EVENT STUDY *** 
    *No covariates
    sdid_event outcome panel_id year post, disag
    // ATT = -0.10393 (identical to sdid w/o covariates)
    *Projected method for covariates (default, no other option available as part of package)
    sdid_event outcome panel_id year post, covariates(occtrends_1 occtrends_2 occtrends_3 occtrends_4 occtrends_5 occtrends_6 occtrends_7 occtrends_8 occtrends_9 occtrends_10 occtrends_11 occtrends_12 occtrends_13 occtrends_14 occtrends_15 occtrends_16 occtrends_17 occtrends_18) disag 
    //ATT = -0.08657 (identical to sdid with projected covariates)

    Thanks in Advance,

    Warmly,
    Aadya

  • #2
    I would like to bump this and also point out that I find that sdid_event and sdid give the same overall ATET with no covariates, but sometimes give slightly but noticeably different ATTs with covariates, even when using the projected option.

    In other words, this:

    Code:
    sdid_event rate id year after, vce(placebo)
    sdid rate id year after, vce(placebo) graph
    produces the same ATT

    but this:

    Code:
    sdid_event rate id year after, vce(placebo) covariates(frpl non_white)
    sdid rate id year after, vce(placebo) covariates(frpl non_white,projected) graph
    produces an ATT of .0970592 using sdid_event and 0.09649 using sdid.

    The difference is not great enough to substantively impact an analysis, but I am wondering if anyone can shed light on where this discrepancy might arise in my case.
    Last edited by Kyle Huisman; 08 Jul 2025, 03:58.



    • #3
      This is Diego Ciccia, maintainer of sdid_event. Thanks for your interest in sdid_event!

      We have recently updated both sdid and sdid_event. Among the new features, we have included (a) cluster-robust inference for bootstrap, placebo and jackknife inference methods (both in sdid and sdid_event), and (b) optimized method for covariate adjustment in sdid_event.
      You can re-install the packages directly from the Github repository. Both updates will soon be pushed to SSC.

      As for the first question, the optimized method for covariate adjustment now also works in sdid_event. It is the default (as in sdid) and returns the same ATT estimate as sdid.
      Here's an example:
      Code:
      clear
      
      set seed 123456
      local GG = 50
      local TT = 10
      set obs `=`GG'*`TT''
      gen G = mod(_n-1, `GG') + 1
      bys G: gen T = _n
      sort G T
      
      gen D = G >= `GG'/2 & T >= `TT'/2 + 4*(G/`GG')
      gen C = uniform() * (1 + D)
      gen Y = uniform() * (1 + D) + 0.5 * C
      
      sdid_event Y G T D, covariates(C) vce(off)
      sdid Y G T D, covariates(C) vce(noinference)
      As for the second question, the small discrepancy arises because sdid computes the residuals explicitly via matrix algebra, while sdid_event uses the built-in Stata predict function.

      I hope this helps! Let me know if you have any other questions and/or comments!
