Clustering in csdid: Why is clustering by macroarea inappropriate?

Adriano Ruggiero

Join Date: Jun 2024

Posts: 7
#1

28 May 2025, 04:00

Dear all,

I'm currently estimating a staggered Difference-in-Differences model using the csdid command in Stata, following the Callaway and Sant’Anna (2021) framework. My units of observation are hospitals (org_id), observed over multiple years (year), and hospitals receive treatment in different years (first_treatment).

Initially, I ran the following command:

csdid outcome, ivar(org_id) time(year) cluster(macroarea) gvar(first_treatment) method(drimp) notyet

In this specification, macroarea refers to 4 broad geographical areas in Italy: North, Center, South, and Islands.

However, I received the following critique from a reviewer:
"Clustering by macroarea rather than by ASL or hospital is incorrect. Clustering should be done at the level at which the treatment is assigned. Using macroarea artificially improves the results because you are defining the variation in the standard errors at too aggregate a level."
My understanding is that this critique refers to the fact that:
The treatment is assigned at the hospital (org_id) level.

Clustering at too high a level (macroarea) underestimates standard errors, potentially leading to over-rejection of the null hypothesis.

Indeed, when trying to switch to cluster(org_id), I encountered the error:
org_id may not be both target and by()
And I understand this stems from the fact that ivar() and cluster() are the same variable in combination with method(drimp) (and wboot). My questions are:

Is it possible that the reviewer made a mistake and the methodology used is correct?

Is there a workaround that lets me specify a robust clustering scheme that respects the level of treatment assignment?

I’d really appreciate any clarification or references on best practices in this situation. Thank you in advance!

Best regards,
Adriano
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2204
#2

28 May 2025, 09:42

As a statistical matter, clustering when there are only four clusters should be avoided because the validity of the standard errors relies on having sufficiently many clusters. You can often get away with 30, maybe even 20, clusters if you have some treated units in enough clusters. With G = 4 clusters the standard errors are likely downward biased — probably what the reviewer is alluding to. Was the policy assigned at the region level? Seems unlikely given it’s a staggered rollout.
FernandoRios

Join Date: Apr 2014

Posts: 2491
#3

28 May 2025, 15:13

Also, if you use csdid with panel data, the standard errors are automatically clustered by ivar.
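
A minimal sketch of what this implies for the specification in #1, assuming the variable names given there: dropping cluster() leaves csdid to cluster at the panel identifier, here the hospital. The post-estimation calls are shown only as one plausible way to check pre-trends afterwards; the choice of aggregation is illustrative.

```stata
* Same specification as in #1, but without cluster():
* with panel data and ivar(), standard errors are clustered by org_id.
csdid outcome, ivar(org_id) time(year) gvar(first_treatment) method(drimp) notyet

* Joint test that all pre-treatment ATT(g,t) are zero
estat pretrend

* Event-study aggregation of the ATT(g,t)
estat event
```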
Adriano Ruggiero

Join Date: Jun 2024

Posts: 7
#4

29 May 2025, 06:15

Thank you very much for your reply, Jeff Wooldridge. The policy was introduced at the national level, but the implementation varied across hospitals over time. I also tried clustering at the regional level (Italy has 20 regions), which would give me more clusters than macroareas. However, I encountered a different but equally important issue: when clustering at the regional level, the parallel trends assumption no longer appears to hold. For this reason, I opted to cluster by macroarea.
Adriano Ruggiero

Join Date: Jun 2024

Posts: 7
#5

29 May 2025, 06:17

Yes, I had the same impression, FernandoRios. That's why I believe my approach may not be entirely inappropriate after all.
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2204
#6

30 May 2025, 08:24

If the implementation is at the discretion of the hospitals then I would cluster at the hospital level. Making the violation of PT insignificant by clustering at an essentially arbitrary higher level, which cannot be supported statistically, won't convince anyone. I would instead include region x year or macroarea x year fixed effects. This allows unrestricted trends by, say, region.

BTW, this is easy to do in jwdid. Just include the region dummies among x, and Fernando's impressive code does the rest.
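
A rough sketch of this suggestion, assuming a categorical variable called region with 20 values (a hypothetical name, not given in the posts above) and the variable names from #1. The dummies are created by hand because it is not assumed here that jwdid accepts factor-variable notation in the covariate list; help jwdid documents the clustering default and the options available in the installed version.

```stata
* Hypothetical variable: region (20 Italian regions); creates reg_1 ... reg_20
tabulate region, generate(reg_)

* Region dummies enter as the covariates x (reg_1 is the omitted base)
jwdid outcome reg_2-reg_20, ivar(org_id) tvar(year) gvar(first_treatment)

* Aggregate ATT and event-study aggregation
estat simple
estat event
```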
Nursena Sagir

Join Date: Jan 2022

Posts: 27
#7

13 Jun 2025, 04:12

Dear Jeffrey,

I have a related question about your statement below:

Originally posted by Jeff Wooldridge

With G = 4 clusters the standard errors are likely downward biased

I'm analyzing the mental health impacts of a policy implemented in 2015 that affected higher education students but not vocational education students. I have a repeated cross-section dataset, with student observations across different education levels and yearly cohorts (n~500k).

For clustering, I'm using education level × year (2013–2017), resulting in 10 clusters. However, when I run the csdid command, I get a puzzling result: the clustered standard errors are much smaller than the non-clustered ones. Could this be due to having too few clusters as well? Should I then avoid using clustering?

Thank you in advance for your reply!

Best regards,
Nursena