Dear all,
I'm currently estimating a staggered Difference-in-Differences model using the csdid command in Stata, following the Callaway and Sant’Anna (2021) framework. My units of observation are hospitals (org_id), observed over multiple years (year), and hospitals receive treatment in different years (first_treatment).
Initially, I ran the following command:
csdid outcome, ivar(org_id) time(year) cluster(macroarea) gvar(first_treatment) method(drimp) notyet
In this specification, macroarea refers to 4 broad geographical areas in Italy: North, Center, South, and Islands.
However, I received the following critique from a reviewer:
"Clustering by macroarea rather than by ASL or hospital is incorrect. Clustering should be done at the level at which the treatment is assigned. Using macroarea artificially improves the results because you are defining the variation in the standard errors at too aggregate a level."
My understanding is that this critique rests on the following points:
- The treatment is assigned at the hospital (org_id) level.
- Clustering at too high a level (macroarea), which here gives only 4 clusters, can underestimate standard errors, potentially leading to over-rejection of the null hypothesis.
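However, when I try to cluster at the hospital level instead, by specifying cluster(org_id), I get the following error:
org_id may not be both target and by()
I understand this stems from the fact that ivar() and cluster() are the same variable in combination with method(drimp) (and wboot). My questions are: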
- Is it possible that the reviewer made a mistake and the methodology used is correct?
- Is there a workaround that lets me specify a robust clustering scheme that respects the level of treatment assignment?
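For concreteness, here is a minimal sketch of two variants that could be compared against the original call. It is not a confirmed fix. The first variant assumes that csdid, when ivar() is supplied for panel data, already computes standard errors clustered at the panel unit, so cluster() can simply be dropped; this should be checked against the csdid help file. The second uses asl_id, a hypothetical variable name for the ASL identifier, and assumes each hospital belongs to exactly one ASL.

```stata
* Sketch only: same specification as the original call, but without cluster().
* Assumption: with ivar() set, csdid already clusters at the panel unit (org_id).
csdid outcome, ivar(org_id) time(year) gvar(first_treatment) method(drimp) notyet

* Hypothetical alternative: cluster at the ASL level.
* asl_id is an assumed variable name; hospitals are assumed to be nested within ASLs.
csdid outcome, ivar(org_id) time(year) cluster(asl_id) gvar(first_treatment) method(drimp) notyet
```

If the first variant runs without the error, its standard errors can be compared directly with those from the cluster(macroarea) specification to see how much the clustering level matters.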
Best regards,
Adriano