Dear Statalister,
I have run into a problem with clustering and hope to receive your advice on it.
I am examining the effect of a policy on spousal earnings using two-way FE (occupation and time) with a continuous treatment. Since the policy affects both spouses at the same time, my equation looks like this:
log(earnings) of wife = a1 * eligibility of husband + a2 * eligibility of wife + a3 * occupation of husband + a4 * occupation of wife + a5 * year + other controls
Clustering at the level of husband's occupation or clustering at the level of wife's occupation gives me the same estimate, but very different standard errors.
Since my main interest is a1, should I cluster at husband's occupation? Is it okay to cluster at husband's occupation?
Alternatively, I am thinking about creating a composite categorical variable based on the occupations of both husband and wife in the year before treatment, following the guidance of Nick Cox (https://www.stata.com/statalist/arch.../msg00095.html):
* occ_husb and occ_wife are placeholder names for husband's and wife's occupation
egen both_occ = group(occ_husb occ_wife), label
Do you think clustering at "both_occ" is more appropriate than using either husband's occupation or wife's occupation?
Another reason for considering the composite cluster "both_occ" is that if I use either husband's occupation or wife's occupation, the total number of clusters is below 50, and I read somewhere that the rule of thumb for the minimum number of clusters is 50.
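To make the comparison concrete, the three clustering choices could be run roughly as below. This is only a sketch under assumptions: the variable names (ln_earn_wife, elig_husb, elig_wife, occ_husb, occ_wife, year) are placeholders rather than names from the actual data, the "other controls" are left out, and regress with indicator variables stands in for whichever fixed-effects estimator is actually used. It assumes both_occ has already been created with the egen command above.

```stata
* Placeholder variable names; replace with the names in your own data.
* Same specification each time, so the point estimates are identical
* and only the standard errors change with the cluster variable.
local spec ln_earn_wife elig_husb elig_wife i.occ_husb i.occ_wife i.year

foreach c in occ_husb occ_wife both_occ {
    regress `spec', vce(cluster `c')
    * e(N_clust) is the number of clusters used for the standard errors
    display "clusters when clustering on `c': " e(N_clust)
}
```

Running the three versions side by side shows how many clusters each choice yields and how much the standard error on elig_husb (the a1 coefficient) moves between them.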
Thanks in advance.