  • Clustering Standard Errors Over a Small Number of Clusters

    Hi all,

    I am using the yearly American Community Survey files to estimate the impact of a policy on wages in five states (3 treatment, 2 counterfactual). Observations are at the individual level, and each state has a large number of them (approximately 70,000 or more). I am using a difference-in-differences model such that:

    Log_Wages = Policy X Year + Individual_Level_Covariates + State_Level_Covariates + State_Fixed_Effects + Time_Fixed_Effects

    I need to account for the clustered nature of the data, but understand that using cluster-robust standard errors (the cluster option in Stata) with only 5 groups is likely to bias my standard errors downward. I was pointed to the following Donald and Lang paper as an alternative approach to inference when the number of groups G is small and the number of members n in each group is large:

    Stephen G. Donald & Kevin Lang, 2007. "Inference with Difference-in-Differences and Other Panel Data," The Review of Economics and Statistics, MIT Press, vol. 89(2), pages 221-233, May.

    I've read the paper multiple times, but am not sure I understand what their method is. I believe their two-step process consists of first deriving the difference-in-differences (DID) estimator (without clustering), then regressing the DID estimator on a dummy variable for the policy implementation year, using OLS and a t-distribution for the parameter estimate. Is this correct? I would appreciate any clarification, or an applied, step-by-step example of how this method of clustering might be implemented. I'm also happy to provide more detail on the data I'm currently working with, if that would be helpful.

    Thank you!

    Claire Cahen

  • #2
    You'll increase your chances of a useful answer by following the FAQ on asking questions: provide Stata code in code delimiters, readable Stata output, and sample data using dataex.

    You're asking us to read a paper and figure it out for you. That only happens if someone happens to be interested in that specific paper. Alternatively, is there a significant state effect? If not, you might consider OLS with clustered standard errors. Otherwise, take one of the data sets the paper estimates and work with it until you replicate their results. The description on page 229 seems clear.
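
    For what it's worth, here is a minimal sketch of the two-step idea as it is commonly summarized. It is not a replication of the paper, so check it against page 229. Step 1 collapses the individual data to one mean of log wages per state-year cell. Step 2 runs OLS on those cell means with a policy dummy, state dummies and year dummies, and judges the policy coefficient against a t-distribution whose degrees of freedom come from the number of cells, not the number of individuals. It is written in plain Python rather than Stata only so that it is self-contained. The function names and the (state, year, log_wage) input layout are hypothetical, and individual-level covariates are left out (my understanding is that they would be handled in step 1 by replacing the raw cell means with covariate-adjusted ones).

```python
import math
from collections import defaultdict


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * v for a, v in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]


def ols(X, y):
    """Return OLS coefficients, classical standard errors and residual df."""
    n, k = len(X), len(X[0])
    XtX = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(k)]
           for a in range(k)]
    Xty = [sum(X[i][a] * y[i] for i in range(n)) for a in range(k)]
    beta = solve(XtX, Xty)
    resid = [y[i] - sum(X[i][j] * beta[j] for j in range(k)) for i in range(n)]
    dof = n - k
    s2 = sum(e * e for e in resid) / dof
    se = []
    for j in range(k):
        unit = [1.0 if i == j else 0.0 for i in range(k)]
        se.append(math.sqrt(s2 * solve(XtX, unit)[j]))  # j-th diagonal of (X'X)^-1
    return beta, se, dof


def donald_lang_two_step(rows, treated_cells):
    """rows: iterable of (state, year, log_wage) for individuals.
    treated_cells: set of (state, year) pairs where the policy is in force.
    Returns (policy effect, standard error, degrees of freedom)."""
    # Step 1: collapse individuals to state-year cell means (no covariates here).
    totals = defaultdict(lambda: [0.0, 0])
    for state, year, y in rows:
        totals[(state, year)][0] += y
        totals[(state, year)][1] += 1
    cells = sorted(totals)
    states = sorted({s for s, _ in cells})
    years = sorted({t for _, t in cells})

    # Step 2: OLS on the cell means with policy, state and year dummies.
    X, ybar = [], []
    for s, t in cells:
        row = [1.0, 1.0 if (s, t) in treated_cells else 0.0]
        row += [1.0 if s == g else 0.0 for g in states[1:]]  # first state omitted
        row += [1.0 if t == u else 0.0 for u in years[1:]]   # first year omitted
        X.append(row)
        ybar.append(totals[(s, t)][0] / totals[(s, t)][1])
    beta, se, dof = ols(X, ybar)
    return beta[1], se[1], dof
```

    The t-statistic is the returned effect divided by its standard error, compared with a t critical value at the returned degrees of freedom (cells minus second-stage parameters). With 5 states that number is small, which is the point: the inference reflects how few state-year cells there are, not the 70,000 or so individuals per state.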
