standard error correction with 25 clusters and multiple fixed effects

Mary Kate Batistich

Join Date: Aug 2025

Posts: 2
#1

25 Aug 2025, 09:50

Hello,

I am using ivreghdfe because I have an instrumental variable and state, age, and year fixed effects. I want to cluster my SEs at the state level, where I have 25 states. I understand this to be too few clusters, and that I should correct for it with a wild cluster bootstrap. The postestimation command "boottest" should help with this, but it is not compatible with ivreghdfe when there are multiple fixed effects. Would it be sufficient to use ivreg2, include the fixed effects in the regression as "i.state i.year i.age", and then produce a 95% confidence interval for the coefficient of interest using "boottest"? Or is there another approach that is more suitable?

I have noticed that I do not get the same boottest result after ivreghdfe as after ivreg2. The specifications I have tried are:

ivreg2 ln_earninc_bm (c17shr_b=frate_vs_b) i.age i.year i.statefip if age>=25 & age<=54 & mainyears==1 [aw=tot_b17], cluster(statefip)
boottest c17shr_b, cluster(statefip)

ivreghdfe ln_earninc_bm (c17shr_b=frate_vs_b) i.age i.year if age>=25 & age<=54 & mainyears==1 [aw=tot_b17], absorb(statefip) cluster(statefip)
boottest c17shr_b, cluster(statefip)

ivreghdfe ln_earninc_bm (c17shr_b=frate_vs_b) i.age i.year i.statefip if age>=25 & age<=54 & mainyears==1 [aw=tot_b17], cluster(statefip)
boottest c17shr_b, cluster(statefip)

I get slightly different results each time. Any advice would be greatly appreciated. I am using Stata MP 19.5. Thank you!
Andrew Musau

Join Date: Oct 2014

Posts: 10284
#2

25 Aug 2025, 10:52

boottest, from SSC, does not support ivreghdfe. You could instead use the official xtivreg after xtset with the fixed effects variable that has the largest number of levels. The remaining absorbed variables can then be added as indicator variables.
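A rough sketch of that route, using the variable names from #1. It has not been run against the data, and several things in it are assumptions: statefip is used as the panel variable only so that the panels coincide with the clusters (swap in whichever fixed-effects variable has the most levels), the seed value is arbitrary, and the [aw=tot_b17] weights from #1 are left out, so check help xtivreg for weight support before relying on it.

```stata
* Sketch only: xtivreg with one absorbed fixed effect, the rest as indicators.
* Assumption: statefip is the panel variable, so panels coincide with clusters.
xtset statefip

* Fixed-effects IV; age and year enter as indicator variables.
* The weights used in #1 are deliberately omitted here.
xtivreg ln_earninc_bm (c17shr_b = frate_vs_b) i.age i.year ///
    if age>=25 & age<=54 & mainyears==1, fe vce(cluster statefip)

* Wild bootstrap test of the endogenous regressor, clustered by state.
* seed(12345) is an arbitrary value, set only to make the run reproducible.
boottest c17shr_b, cluster(statefip) seed(12345) nograph
```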
Mary Kate Batistich

Join Date: Aug 2025

Posts: 2
#3

25 Aug 2025, 13:37

Hi Andrew,

Thank you for the quick response! It appears that xtivreg does not support weights, which I would like to include in the estimation. Do you have a sense of the advantage of using xtivreg over ivreg2 with the fixed effects added as indicators, or over something similar to what you suggest but using ivreghdfe instead (which does allow weights)? In other words, could you tell me why your proposed strategy would be an improvement over any of the 3 I suggested above?

Thanks again,
Mary Kate
Andrew Musau

Join Date: Oct 2014

Posts: 10284
#4

25 Aug 2025, 15:05

ivreg2 is fine, as it is supported by boottest. As I mentioned in #2, ivreghdfe is not supported, so it is out of the question. The advantage of xtivreg over ivreg2 is that, after xtset, you effectively absorb one of the fixed effects variables, whereas with ivreg2, you must include all fixed effects as indicators. However, ivreg2 has a -partial- option that allows you to partial out RHS variables whose coefficients are not of immediate interest:

The partial(varlist) option requests that the exogenous regressors in varlist are "partialled out" from all the other variables (other regressors and excluded instruments) in the estimation. If the equation includes a constant, it is also automatically partialled out as well. The coefficients corresponding to the regressors in varlist are not calculated. By the Frisch-Waugh-Lovell (FWL) theorem, in IV, two-step GMM and LIML estimation the coefficients for the remaining regressors are the same as those that would be obtained if the variables were not partialled out. (NB: this does not hold for CUE or GMM iterated more than two steps.) The partial option is most useful when using cluster and #clusters < (#exogenous regressors + #excluded instruments). In these circumstances, the covariance matrix of orthogonality conditions S is not of full rank, and efficient GMM and overidentification tests are infeasible since the optimal weighting matrix W = S^-1 cannot be calculated. The problem can be addressed by using partial to partial out enough exogenous regressors for S to have full rank. A similar problem arises when the regressors include a variable that is a singleton dummy, i.e., a variable with one 1 and N-1 zeros or vice versa, if a robust covariance matrix is requested. The singleton dummy causes the robust covariance matrix estimator to be less than full rank. In this case, partialling-out the variable with the singleton dummy solves the problem. Specifying partial(_cons) will cause just the constant to be partialled-out, i.e., the equation will be estimated in deviations-from-means form. When ivreg2 is invoked with partial, it reports test statistics with the same small-sample adjustments as if estimating without partial, with the exception of the information in the output header (the model F, R-sqs and total sums-of-squares refer to the model after the variables are partialled-out). 
Note that after estimation using the partial option, the post-estimation predict can be used only to generate residuals.

I don't think this affects how boottest works, but you can experiment by running boottest with some variables included as indicators versus the same variables partialled out. Be sure to set the bootstrap seed so that you can compare results consistently.
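The experiment could be sketched as follows. This is a sketch under assumptions, not a tested recipe: the variable names come from #1, the seed and number of replications are arbitrary, and I am assuming partial() accepts the factor-variable terms as written. Whether boottest runs at all after ivreg2 with partial() is part of what the experiment checks, hence the capture noisily on the second call.

```stata
* (a) All fixed effects included as indicator variables.
ivreg2 ln_earninc_bm (c17shr_b = frate_vs_b) i.age i.year i.statefip ///
    if age>=25 & age<=54 & mainyears==1 [aw=tot_b17], cluster(statefip)
* Same seed and reps in both runs so the bootstrap draws are comparable.
boottest c17shr_b, cluster(statefip) seed(12345) reps(9999) nograph

* (b) Same model, with the fixed effects partialled out instead.
ivreg2 ln_earninc_bm (c17shr_b = frate_vs_b) i.age i.year i.statefip ///
    if age>=25 & age<=54 & mainyears==1 [aw=tot_b17], cluster(statefip) ///
    partial(i.age i.year i.statefip)
* capture noisily: show any error message without stopping the do-file.
capture noisily boottest c17shr_b, cluster(statefip) seed(12345) reps(9999) nograph
```

If both calls run, the p-values and confidence intervals from (a) and (b) can be compared directly, since the seed is fixed.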

Last edited by Andrew Musau; 25 Aug 2025, 15:11.
