I am running a difference-in-differences analysis to estimate the association between a state-level policy and a set of outcomes using individual-level survey data. To calculate standard errors, I want to account for the survey weights, the state-level implementation of the policy (clustering by state), and the small number of clusters (15 states total). I've tried a couple of approaches and am having trouble combining commands in Stata to do what I think is necessary.
Note: The analysis is on a subgroup of the total survey sample. The variable 'sample' is an indicator for whether an individual is included in the study sample.
1) SEs calculated using svy (according to instructions provided by the CDC):
svyset _n [pweight=wt], strata(sud_nest) fpc(totcnt)
global did postXtreat i.state i.yy
global covariates age race etc.
svy, subpop(sample): reg outcome $did $covariates
2) SEs calculated using reg, cluster with weights:
reg outcome $did $covariates if sample==1 [pw=wt], cluster(state)
- This gives larger SEs than option #1, as expected with clustering
- However, one concern I have is using an "if" statement, which is not recommended for subsetting survey data (and can result in incorrect/often inflated errors)
3) Wild cluster bootstrap SEs (to address small # cluster problem) ***can't implement this***
clustse reg outcome $did $covariates if sample==1 [pw=wt], cluster(state) method(wild) reps(500)
- multiple errors occur: no "if" allowed, no weights allowed, factor variable operators not allowed
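One possible route for the wild cluster bootstrap, sketched here under assumptions rather than as a verified solution: the community-contributed boottest command (installed with ssc install boottest) runs after a weighted regress with cluster(state) and, as far as I know, accepts pweights, "if" conditions, and factor variables in the preceding estimation. The toy example below uses made-up simulated data with the same variable names as above; the treated-state split, the Webb weight type, and the number of replications are illustrative choices only. Note that boottest reports a bootstrap p-value and confidence interval for the tested coefficient rather than a standard error.

```stata
* Toy data only: 15 states, 200 respondents each; all values are made up.
clear
set seed 2024
set obs 3000
gen state      = ceil(_n/200)              // 15 clusters
gen yy         = 2010 + mod(_n, 4)         // 4 survey years
gen treat      = state <= 7                // hypothetical treated states
gen post       = yy >= 2012                // hypothetical policy start
gen postXtreat = post*treat                // DiD interaction term
gen age        = 18 + floor(50*runiform())
gen wt         = 0.5 + runiform()          // stand-in survey weight
gen sample     = runiform() < 0.8          // subgroup indicator
gen outcome    = 0.5*postXtreat + 0.01*age + rnormal()

* Weighted regression with state-clustered SEs, as in option 2.
regress outcome postXtreat i.state i.yy age if sample==1 [pw=wt], cluster(state)

* Wild cluster bootstrap test of H0: coefficient on postXtreat = 0.
* Requires the community-contributed package: ssc install boottest
boottest postXtreat, cluster(state) weighttype(webb) reps(9999) seed(12345) nograph
display "wild bootstrap p-value = " r(p)
```

This keeps the survey weights and state clustering from option 2 but does not use the strata or fpc information from svyset, so it is not a full design-based answer.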
I'd appreciate any advice on this problem more generally, and any Stata commands that would help in this situation. One road I haven't gone down yet is randomization inference; I would also appreciate pointers to any packages that implement it in a svy context.