The point estimates for xthdidregress can be obtained by subtracting each unit's base-period outcome from its treated-period outcome and applying teffects estimators to this transformed outcome. (At least, the IPW results can be replicated this way.) However, assume we have only one observation per unit, spread across a number of years. Observations from the years before the treatment carry a treatment status indicator, but we do not observe those same units again in the post-treatment period. How does hdidregress ipw work in this context? Can anyone give me an example that replicates the point estimate using logit to calculate the propensity scores and regress with IPW weights?
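For reference, the panel replication described above looks roughly like the sketch below. All variable names (id, year, y, d, x1, x2) are placeholders rather than real data, the covariates are assumed to be time-invariant, and the comparison group is assumed to be never-treated units.

```stata
* Hypothetical two-period panel: id, year (2022/2023), outcome y,
* cohort indicator d (1 = first treated in 2023, 0 = never treated),
* and time-invariant covariates x1 x2. All names are placeholders.
xtset id year

* Change in the outcome relative to the 2022 base period (defined on 2023 rows)
gen double dy = y - L.y

* IPW ATET on the differenced outcome, with a logit treatment model
teffects ipw (dy) (d x1 x2, logit) if year == 2023, atet
```

With repeated cross-sections there is no L.y to difference against, which is exactly where this approach stops working for me.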
Let me give a concrete example. Say we are trying to estimate the effect of a scholarship on university attendance among students in a district. The program began in 2023. We compare students who were eligible for the scholarship (or would have been eligible had it been in place) with ineligible students, before and after the program began. Each observation is an individual high school graduate's college attendance outcome within six months of graduation, for the graduating cohorts of 2016 to 2024. Treated and untreated observations are grouped by whether graduating students were eligible post-treatment, or would have been eligible pre-treatment had the program been in effect. I tried replicating the hdidregress ATET for the class of 2023 using the following code:
Code:
preserve
* 2x2 sample: eligible vs. ineligible graduates in 2022 (pre) and 2023 (post)
gen treat_2023 = inlist(graduate_year, 2022, 2023) & eligible_dummy == 1
gen control_2022 = inlist(graduate_year, 2022, 2023) & eligible_dummy == 0
keep if treat_2023 | control_2022
* post-period indicator
gen treated = 0
replace treated = 1 if graduate_year == 2023
* propensity score fitted on the 2022 cohort only
logit eligible_dummy i.FRPL i.sex i.racex if graduate_year == 2022
predict pscore, pr
* weights: 1/(1 - pscore) for ineligible students, 1 for eligible students
gen ipw = .
replace ipw = 1/pscore if treat_2023 == 1
replace ipw = 1/(1 - pscore) if treat_2023 == 0
replace ipw = 1 if treat_2023 == 1
drop if graduate_year < 2022
reg attend_uni i.treat_2023##treated [pw = ipw], vce(robust)
But this isn't quite correct. I actually get a result even closer to hdidregress if I include both years of the 2x2 comparison in the logit, but the estimates still do not match exactly.
Ignore the fact that this is not really a staggered treatment timing case. I am using the command as a convenience tool to incorporate a selection model into the regression. I want to do this replication so that I can check covariate balance diagnostics for each ATET estimate. Can the ATETs be replicated using basic commands here? What am I missing?
Sorry that I cannot give a data example for confidentiality reasons.
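One variant that may be worth comparing is sketched below. It fits the logit on both cohorts pooled and gives ineligible students the odds weight pscore/(1 - pscore), which is the usual ATET weighting, instead of 1/(1 - pscore). It reuses the variable names from my code. The names post, ps and w_atet are new and purely illustrative. I have not confirmed that this is what hdidregress ipw does internally, so treat it as an assumption to test, not a known replication.

```stata
* Sketch only: pooled logit plus ATET-style weights.
* Not confirmed to match the internals of hdidregress ipw.
keep if inlist(graduate_year, 2022, 2023)
gen byte post = graduate_year == 2023

* Propensity of eligibility, fitted on both cohorts pooled
logit eligible_dummy i.FRPL i.sex i.racex
predict double ps, pr

* ATET weights: 1 for eligible students, odds ps/(1 - ps) for ineligible ones
gen double w_atet = cond(eligible_dummy == 1, 1, ps/(1 - ps))

* Saturated 2x2 regression: the interaction coefficient is the weighted DiD
regress attend_uni i.eligible_dummy##i.post [pw = w_atet], vce(robust)
```

Because the regression is fully saturated, the interaction coefficient equals the difference in weighted cell means, so rescaling the weights within a group-by-year cell does not change the point estimate. The robust standard errors here ignore the fact that the propensity score is estimated, so I would only expect the point estimate, not the standard error, to be comparable.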