Clustering standard errors in a survey with nested sampling

Rayyan Mobarak

Join Date: May 2020

Posts: 17
#1

14 Mar 2023, 14:15

Hi everyone,

This is more of a statistical-approach question than a Stata coding question.

I am using a dataset that was collected by randomly selecting communities (primary sampling units) from all communities defined in the national census of a certain country. Households (secondary sampling units) were then randomly selected within these selected communities. All household heads were surveyed to capture household-level variables and variables specific to the household head (for example, age and gender), while consenting individuals from the household were further surveyed for more individual-level variables.

I am trying to estimate how household-level income predicts household-level health safety measures, controlling for other household and individual variables and using district-level shocks to income as instruments (the district is the largest geographical aggregation in the data). My question is: at what level do I need to cluster my standard errors? Would clustering at the primary sampling unit (community) suffice, or do I need to cluster at the household or district level?
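To make the question concrete, here is a rough sketch of the alternatives I am weighing. Every variable name below (safety_measures, hh_income, district_shock, hh_size, head_age, head_female, community_id, district_id, household_id) is a hypothetical placeholder rather than a name from my data, and I am not claiming that any one of these is the correct choice.

```stata
* Sketch only: all variable names are hypothetical placeholders.

* Option A: 2SLS with standard errors clustered at the community (PSU) level
ivregress 2sls safety_measures hh_size head_age head_female ///
    (hh_income = district_shock), vce(cluster community_id)

* Option B: the same model, clustered at the district level,
* which is the level at which the instrument varies
ivregress 2sls safety_measures hh_size head_age head_female ///
    (hh_income = district_shock), vce(cluster district_id)

* Option C: declare the two-stage design and use the svy prefix
* (communities as PSUs, households as second-stage units)
svyset community_id || household_id
svy: ivregress 2sls safety_measures hh_size head_age head_female ///
    (hh_income = district_shock)
```

Option C omits sampling weights and strata only because I have not described them here; it is meant to show the design declaration, not a complete setup.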

Thank you!
Rayyan Mobarak

Join Date: May 2020

Posts: 17
#2

14 Mar 2023, 19:56

It is also worth mentioning that there were two rounds of the survey, in 2014 and 2015. The communities that were randomly selected and surveyed in 2014 were also surveyed in 2015, but different households were randomly selected each time without replacement (the same household cannot appear in the data for both survey years, but the same community does). This suggests to me that clustering at the community level is important, but any guidance would be appreciated.