  • How to run one of the new TWFE event study tools (e.g. csdid, did_imputation) with an interaction term

    Hello,

    I want to run one of the new TWFE event study packages including an interaction term, say for quantifying a margin of heterogeneity.

    Say I have an event study of some outcome y in response to an event, for both men and women. I want to estimate the differential effect of the event for men and women, and conduct inference on the interaction coefficients. Before the advent of the new TWFE DiD tools, I would have run something like:

    reghdfe y i.event_time##b0.i.female, cluster(id) absorb(id time)

    (of course with event_time re-leveled so that it has no negative values, which Stata does not allow in factor variables)
    My questions:

    1. How would I do this now with csdid or any of the other new TWFE DiD packages?

    2. Is there a way to accommodate this with seemingly unrelated regression?

    3. Can I do this without bootstrapping? Estimation already takes a LONG time with a never-treated group, and I would imagine bootstrapping 100 times would take something like three months.

    4. If I were to do this by stratifying the regressions by men and women and manually computing the difference, could I impose independence of the errors and just compute the standard errors on the difference coefficients as the square root of the sum of the squared SEs? Would anyone actually believe those SEs that impose independence?

    5. Could I instead just run the old-fashioned TWFE event study regression with the interaction term, as above, and argue that, for inference on the interaction term, the treatment effect heterogeneity "nets out"? This argument already seems pretty hand-wavy, but perhaps there is a reference that makes it more rigorously.
    I have scoured the internet and ChatGPT, but to no avail. I would greatly appreciate anyone's experience and expertise here!
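
    For reference, the calculation described in question 4 can be written out as follows. The notation is introduced here only for illustration: \hat\beta_k^{f} and \hat\beta_k^{m} are the event-time-k estimates from the separate regressions for women and men, and \hat\delta_k is their difference.

    ```latex
    \hat\delta_k = \hat\beta_k^{f} - \hat\beta_k^{m},
    \qquad
    \operatorname{Var}(\hat\delta_k)
      = \operatorname{Var}(\hat\beta_k^{f})
      + \operatorname{Var}(\hat\beta_k^{m})
      - 2\operatorname{Cov}(\hat\beta_k^{f}, \hat\beta_k^{m})

    % Imposing independence sets the covariance term to zero, which gives
    \widehat{\operatorname{SE}}(\hat\delta_k)
      = \sqrt{\widehat{\operatorname{SE}}(\hat\beta_k^{f})^{2}
            + \widehat{\operatorname{SE}}(\hat\beta_k^{m})^{2}}
    ```

    The square-root-of-summed-squares shortcut is therefore exactly as credible as the assumption that the covariance between the two sets of estimates is zero.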

    ____________

    Here is some sample code that I have generated just to illustrate:

    clear
    set obs 1000
    gen id = _n
    gen female = runiformint(0, 1)
    gen event_year = runiformint(2008, 2013)
    expand 20
    gsort id
    bys id: gen year = 2003 + _n
    gen event_time = year - event_year

    //simple DGP
    gen y = rnormal(0, 5) + 10*(event_time >= 0) + 4*(event_time >= 0)*female

    //binning
    replace event_time = -5 if event_time <= -5
    replace event_time = 10 if event_time >= 10

    //re-leveling event time so as to avoid negative categorical variables
    sum event_time
    gen timing = event_time - r(min)
    sum timing if event_time == -1
    loc base = r(mean)

    reghdfe y b`base'.i.timing##b0.i.female, cluster(id) absorb(id year)

    Following this DGP and using the above procedure, I want to recover post-event event_time#female coefficients of around 4, and ideally also common post-event event-time coefficients of around 10.


  • #2
    Let me immodestly propose the extended TWFE estimator, which shows all of the moderating effects. You can implement it with Fernando Rios-Avila's jwdid command, and the output shows the ATTs as well as all of the interactions. You can also do it "by hand." You can see lots of examples in the shared Dropbox pinned to my Twitter (X) account.

    If you want the so-called leads and lags version, append the "never" option to jwdid.
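
    A minimal sketch of what the jwdid calls described above might look like, using simulated data modeled on the DGP in post #1. Several things here are assumptions rather than anything stated in the thread: the seed, the 30% never-treated share, the coding of never-treated units as event_year = 0 (my understanding of the gvar() convention), and the use of female as a time-invariant covariate to get the moderating interactions. Check help jwdid before relying on any of it.

    ```stata
    * Assumes jwdid and reghdfe are installed (e.g. ssc install jwdid)
    clear
    set seed 12345                          // assumption: any seed, for reproducibility
    set obs 1000
    gen id = _n
    gen female = runiformint(0, 1)

    * First-treatment year; 0 marks never-treated units (assumed gvar() convention)
    gen event_year = runiformint(2008, 2013)
    replace event_year = 0 if runiform() < 0.3

    * Build a balanced panel covering 2004-2023
    expand 20
    bysort id: gen year = 2003 + _n

    * Same effect sizes as the DGP in post #1: 10 for everyone, 4 more for women
    gen post = event_year > 0 & year >= event_year
    gen y = rnormal(0, 5) + 10*post + 4*post*female

    * Extended TWFE with female as a moderator; the coefficient table
    * includes the treatment-by-female interaction terms
    jwdid y female, ivar(id) tvar(year) gvar(event_year)
    estat event                             // aggregate to event-time effects

    * Leads-and-lags version, using never-treated units as the comparison group
    jwdid y female, ivar(id) tvar(year) gvar(event_year) never
    estat event
    ```

    With the never option the model is considerably larger, because pre-treatment cohort-by-period terms are added, so the aggregation step may be noticeably slower.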

    In Stata 18, xthdidregress twfe does the same estimation but does not report the interaction terms as far as I know. JW
