Background:
I have a difference-in-differences (DiD) design in which a policy (subsidised primary healthcare) was rolled out to different age-groups (k=4) in different years. Like many DiD designs, my model combines an individual-level outcome (unmet need for a doctor in the past 12 months; binary) with group-level treatment, which means there will be (1) correlation within age*year clusters and (2) serial correlation within age-groups across years. That is, treatment varies by age-group g and time t but not by individual i, i.e.
UNMET_igt = α_i + α_g + α_t + γ_gt + β*TREAT_gt + δ*COVARS_igt + ε_igt
While I haven't yet obtained access to the panel data, I imagine my code will start from something like the xtreg call under "Code:" below.
As written, that code does not account for the fact that treatment does not vary by i. I am aware that clustering by age*year will bias least-squares standard errors because of serial correlation within age-groups across years. I am also mindful that clustering solely by age-group, to remove the time component, leaves me with only four clusters, which is too few to estimate the within-cluster correlation reliably. The literature I have consulted casts doubt on the suitability of Stata's cluster option when there is serial correlation and few clusters (e.g., Angrist & Pischke, 2008).
Code:
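* declare the panel structure: individual id observed over year
xtset id year
* fixed-effects sketch of the equation above; covars stands in for my covariate list
xtreg unmet i.age##i.year treat covars, fe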
Question:
What are my best options for handling group-level correlations in Stata/SE 14.2?
(1) One option in the literature is to correct the standard errors for serial correlation at the group level by assuming an AR(p) structure, along the lines of Hansen (2007). I haven't seen any Stata modules that do this; does anyone know of one, or of something similar?
(2) Another option I've seen is block bootstrapping, which bootstraps the SEs by resampling whole age-groups, akin to Cameron, Gelbach & Miller (2008). Would this simply involve specifying the option vce(bootstrap, cluster(age)), or have I misunderstood?
(3) Is there a better approach I'm missing?
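To make option (2) concrete, this is roughly what I have in mind. It is an untested sketch, not something I have run: it assumes the variables are named unmet, treat, covars, age and year as in my model above, it swaps xtreg for a pooled linear probability model via regress (so the individual effects are dropped for simplicity), and reps(500) and seed(12345) are arbitrary choices of mine.

```stata
* Untested sketch: pooled LPM with SEs from a block bootstrap that resamples whole age-groups.
* unmet, treat, covars, age and year are placeholder names; reps() and seed() are arbitrary.
regress unmet i.age i.year treat covars, vce(bootstrap, cluster(age) reps(500) seed(12345))
```

With only four age-groups, each bootstrap replication redraws from just four blocks, which is part of why I am unsure this is adequate.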
Apologies if I've missed something self-evident! I'm new to this form of analysis.
Refs:
Angrist, J. D., & Pischke, J. S. (2008). Mostly Harmless Econometrics: An Empiricist's Companion. Princeton University Press.
Cameron, A. C., Gelbach, J. B., & Miller, D. L. (2008). Bootstrap-based improvements for inference with clustered errors. The Review of Economics and Statistics, 90(3), 414-427.
Hansen, C. B. (2007). Generalized least squares inference in panel and multilevel models with serial correlation and fixed effects. Journal of Econometrics, 140(2), 670-694.