Cluster at state level

Kushneel Prakash

Join Date: Oct 2018

Posts: 13
#1

31 Oct 2018, 23:31

Hello Statalist users,

I have followed this forum for a long time, but this is my first time posting a question.

I am currently working on a research paper using individual-level data from a survey dataset. I have a few points to clarify as I continue learning econometrics with Stata.

1. I am using panel data with fixed effects (FE), and I have run xtset individualid year. The data are collected annually for 7 states, and I combine them with the annual average fuel price per state. I run POLS and panel FE, and then do the same using an instrumental variable on my dependent variable. In the panel FE model, I use the options , fe vce(cluster individualid). The POLS and panel FE results are insignificant but in the correct direction; once instrumented, the estimate becomes significant and is as I expect. My question: is there anything I should worry about regarding the significance of my results across the 4 specifications?

2. If I also want standard errors clustered at the state level, how can I do that? When I tried, Stata reported that the data are not nested in clusters, so the estimation did not go through. This could be because some individuals changed states over the survey waves. Any ideas on how to proceed?

Thanks.
Phil Bromiley

Join Date: Apr 2014

Posts: 4348
#2

02 Nov 2018, 11:36

You'll increase your chances of a useful answer by following the FAQ on asking questions - provide Stata code in code delimiters, readable Stata output, and sample data using dataex. For example, you refer to POLS, but don't tell us what command you are actually estimating.

If you are instrumenting, you should look at xtivreg and xtivreg2 - if you just shove in predicted values you get the wrong standard errors. xtivreg2 has more diagnostics than xtivreg.
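As a rough illustration of the point about xtivreg, here is a minimal sketch on simulated data. The variable names (individualid, year, y, x, z) and the data-generating process are assumptions made up for this example, not the original poster's data. It runs the FE model with individual-level clustering and then the FE instrumental-variables model in one step, rather than plugging in predicted values by hand.

```stata
* Simulated panel: 300 hypothetical individuals observed for 6 years
clear
set seed 2018
set obs 300
gen individualid = _n
* Time-invariant individual effect (copied to every wave by expand)
gen a = rnormal()
expand 6
bysort individualid: gen year = 2011 + _n
* z is a hypothetical instrument; u makes x endogenous
gen z = rnormal()
gen u = rnormal()
gen x = 0.8*z + 0.5*u + a + rnormal()
gen y = 1.0*x + a + u + rnormal()

xtset individualid year

* Panel FE with standard errors clustered on the individual
xtreg y x, fe vce(cluster individualid)

* FE instrumental-variables estimator: x instrumented by z
xtivreg y (x = z), fe
```

Because the first stage and second stage are estimated together, the reported standard errors account for the instrumenting step.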

With multiple observations per person, I would normally cluster at the individual level. There is an alternative argument for clustering at the state level, since cluster-robust standard errors are robust to problems within the clusters. You might look at reghdfe, which allows more complex panel patterns, and xtgls, which allows cross-panel associations. I don't know for sure, but I suspect reghdfe allows for endogeneity.

As for folks moving state, I'd see how many there are. If this is very rare, I'd be tempted to omit such folks (I know this is not strictly correct, but if you have 10000 individuals and 2 that switch states, I'd drop the 2.)
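One way to check how many people switch states could look like the sketch below. It uses a toy panel in which the variable names (individualid, year, state, x, y) and the four movers are invented for illustration. It flags anyone observed in more than one state, counts them, and, assuming they are rare enough to drop, reruns the FE model with state-level clustering.

```stata
* Toy panel: 200 hypothetical individuals, 5 waves, 7 states
clear
set seed 12345
set obs 200
gen individualid = _n
expand 5
bysort individualid: gen year = 2010 + _n
gen state = mod(individualid, 7) + 1
* Four individuals move to another state from 2013 onwards
replace state = mod(state, 7) + 1 if individualid <= 4 & year >= 2013
gen x = rnormal()
gen y = 0.5*x + rnormal()

* Flag anyone observed in more than one state
bysort individualid (state): gen byte mover = state[1] != state[_N]
* Tag one observation per person so each mover is counted once
egen byte onepp = tag(individualid)
count if onepp & mover

* If movers are rare, drop them so that panels nest within states
drop if mover
xtset individualid year
xtreg y x, fe vce(cluster state)
```

Once the movers are gone, each individual belongs to exactly one state, which is what clustering on state in xtreg, fe needs.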
David Benson

Join Date: Oct 2018

Posts: 489
#3

03 Nov 2018, 19:22

I agree with Phil: I would tend to cluster on the individual and inspect how many individuals change states over the time period. If they are a very small fraction, I would also be tempted to omit them from the sample, unless they are a key part of your hypotheses (i.e., "individuals who moved in response to an incentive were among the most productive to begin with, and became even more productive once they relocated to State B...").

reghdfe (SSC) allows for clustering at two levels, see http://scorreia.com/software/reghdfe/quickstart.html
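A minimal sketch of two-level clustering with reghdfe might look like this. It assumes reghdfe and its dependency ftools can be installed from SSC, and it uses a toy panel whose variable names and movers are made up for illustration. Here the movers are kept in the sample, on the understanding that reghdfe does not need individuals to be nested within states.

```stata
* Requires the community-contributed reghdfe (and ftools) from SSC
capture which ftools
if _rc ssc install ftools
capture which reghdfe
if _rc ssc install reghdfe

* Toy panel: 200 hypothetical individuals, 5 waves, 7 states
clear
set seed 12345
set obs 200
gen individualid = _n
expand 5
bysort individualid: gen year = 2010 + _n
gen state = mod(individualid, 7) + 1
* Four individuals change state from 2013 onwards
replace state = mod(state, 7) + 1 if individualid <= 4 & year >= 2013
gen x = rnormal()
gen y = 0.5*x + rnormal()

* Individual and year fixed effects; standard errors clustered
* on both individual and state
reghdfe y x, absorb(individualid year) vce(cluster individualid state)
```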

You might also check out Doug Miller's Stata code page at http://faculty.econ.ucdavis.edu/faculty/dlmiller/statafiles/

ivreg2 (SSC) allows for clustering on two variables. I don't know if xtivreg2 does.

Finally, a Statalist entry from 2010 (when it was an email list rather than this forum) suggested that the person do egen c = group(var1 var2) and then cluster on c.
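To make the egen suggestion concrete, here is a small sketch using the auto dataset that ships with Stata; foreign and rep78 simply stand in for var1 and var2. Note that this clusters on each combination of the two variables, which is not the same thing as two-way clustering on each variable separately.

```stata
* Built-in example data; foreign and rep78 stand in for var1 and var2
sysuse auto, clear

* One cluster identifier per observed combination of the two variables
egen c = group(foreign rep78)

* Cluster the standard errors on the combined identifier
regress price weight, vce(cluster c)
```

Observations with a missing value in either grouping variable get a missing c and are left out of the regression.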
Kushneel Prakash

Join Date: Oct 2018

Posts: 13
#4

05 Nov 2018, 05:31

Thank you so much for these helpful comments. I will have a thorough look at these.