Dear Statalist,
We have run into a problem concerning the appropriate use of clustered standard errors in a multilevel regression model.
Data: Our data are unbalanced, pooled cross-sectional data (i.e., not panel data). Individuals were surveyed over 10 years across 90 countries (total number of individual observations: approximately 1.0 million). Not every country participated in every year. The respondents within each country and year are randomly sampled. The table below illustrates our heterogeneous data. Our DV is binary. We have a rich set of controls at both the individual level and the country level.

Analysis: Because our individuals are nested in countries, we perform a multilevel logistic regression using the following command in Stata 17:
melogit DV IV individual_level_controls country_level_controls year_dummies || country:
We were asked to additionally include clustered standard errors (vce(cluster country)). We did not include this option at first because we thought that the multilevel structure already accounts for the fact that observations within each country are not independent. Also, published studies in our field that use a similar setup sometimes include clustered standard errors and sometimes do not.
Problem: If we add the vce(robust) option to our melogit ... || country: command, the significance of our IV changes drastically (from a p-value of 0.00 to 0.30-0.40).
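To make the comparison concrete, here is a minimal sketch of how the two sets of standard errors can be put side by side. It reuses the placeholder variable lists from the command above; the stored-estimate names (model_based, cluster_robust) are only illustrative labels.

```stata
* Default (model-based) standard errors
melogit DV IV individual_level_controls country_level_controls year_dummies || country:
estimates store model_based

* Same model with standard errors clustered on country
melogit DV IV individual_level_controls country_level_controls year_dummies || country:, vce(cluster country)
estimates store cluster_robust

* Coefficients, standard errors, and p-values next to each other
estimates table model_based cluster_robust, se p
```

The point estimates should be identical in both columns; only the standard errors and p-values differ.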
Way forward: We are looking for suggestions on how to move forward. That is, should we include clustered standard errors or not? We also read the recent paper by MacKinnon et al. (2023) (https://www.sciencedirect.com/scienc...4407622000781), which discusses the issue and states that clustered standard errors are sometimes too conservative, especially if clusters are very heterogeneous. The paper suggests using a wild cluster bootstrap (implemented in Stata via boottest). However, boottest does not work after melogit, and the paper seems to be written with linear models in mind.
We would be very grateful for any recommendations on how to proceed.