Hi,
My question is about using fixed effects and clustered standard errors in a difference-in-differences setup.
I am trying to estimate the causal effect of a policy change on a measure of employment over 8 year_quarter periods (the 4 quarters of 2005 are the pre-period; the 4 quarters of 2010 are the post-period).
The policy change was announced in one state and concerns a specific agricultural commodity. I'm trying to compare this 'treatment' state to 5 other 'control' states that did not have this policy but are also important cultivators of this agricultural commodity.
I draw my sample from a multistage stratified random sample/pooled cross-section.
My data is at three levels:
1. Individual (60000 observations)
2. County (1488 observations; approx 180 observations per year_quarter)
3. State (6 states)
I'm trying to estimate a basic diff-in-diff model using year_quarter fixed effects, state or county fixed effects and clustered standard errors.
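To make the setup concrete, the county fixed effects version of the model I have in mind could be written roughly as below. The notation is my own shorthand, assumed purely for illustration and not taken from any of my sources: i indexes individuals, c counties, s states and t year_quarters, and X stands for whatever controls end up being included.

```latex
% Illustrative sketch only; every symbol here is an assumed name.
% i = individual, c = county, s = state, t = year_quarter
Y_{icst} = \alpha_c + \gamma_t
         + \beta \, (\mathrm{Treat}_s \times \mathrm{Post}_t)
         + X_{icst}'\delta + \varepsilon_{icst}
```

Here \alpha_c are county fixed effects (the state fixed effects version would replace them with \alpha_s), \gamma_t are year_quarter fixed effects, Treat_s equals 1 for the treatment state, Post_t equals 1 for the 2010 quarters, and \beta is the diff-in-diff coefficient of interest. The main effects of Treat_s and Post_t do not appear separately because they are absorbed by the state or county fixed effects and by the year_quarter fixed effects, respectively.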
My questions are:
1. Which is more appropriate to use here: state fixed effects or county fixed effects? Would a state_county fixed effect be appropriate? I am keen on using counties because the pattern of cultivation of the commodity varies between counties within each state. My intuition is that state_county fixed effects should be appropriate, but if the 'treatment' is assigned at the state level, would this still be the case?
2. Relatedly, in the sources that I have consulted so far, the broad consensus is that standard errors should be clustered at the level at which the treatment is assigned. In my case, this would entail clustering at the state level. However, if I use state_county fixed effects, do I also need to cluster my standard errors at the county level?
3. Would any of the decisions in 1 and 2 have implications for the kinds of controls I pick (county vs state level)? Would I need to include an interaction term between the post-treatment dummy and the controls?
Thank you!
Kris.
