Dear All,
I am running an instrumental variable regression on a small sample of 83 observations. My Kleibergen-Paap (KP) F statistic is 13. I am running weak-instrument-robust tests with the 'weakiv' package, but I keep encountering an unsupported estimator error when I cluster the standard errors. When I use the 'first' option of ivreg2, I get the Anderson-Rubin Wald test, which rejects the null that construction2009 has no effect on lannual_avg_no2 even if my instrument (lHubDist) is weak. When I use robust standard errors instead, the weakiv command works.
In the paper "A Practical Guide to Weak Instruments" by Keane and Neal, the authors advocate cluster-robust F statistics for panel data and heteroskedasticity-robust standard errors for cross-sectional data. I have cross-sectional data, but I am clustering the standard errors at the district level.
My intuition is that the 'first' option of ivreg2 already reports the AR test, so there is no need to run the weakiv command separately.
I have attached all the output I am receiving. Additionally, I have run the Montiel Olea-Pflueger robust weak instrument test, which suggests that my IV estimates may have a bias of between 20% and 30%. Any help and guidance would be greatly appreciated.
Code:
ivreg2 lannual_avg_no2 (construction2009 = lHubDist) manufacturing2009 tfp lelevation_mean lvcf_mean2013 ldmsp_mean_light2013 lyear2013 lprec2013 ltmean2013 i.state_id2, cl(district_id) first

First-stage regressions
-----------------------

First-stage regression of construction2009:

Statistics robust to heteroskedasticity and clustering on district_id
Number of obs                    =     83
Number of clusters (district_id) =     71
--------------------------------------------------------------------------------------
                     |               Robust
    construction2009 | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
---------------------+----------------------------------------------------------------
            lHubDist |  -.0184343   .0050494    -3.65   0.001    -.0285346    -.008334
   manufacturing2009 |  -.0080146   .0687827    -0.12   0.908    -.1456005    .1295713
                 tfp |  -.0248488   .0107343    -2.31   0.024    -.0463207    -.003377
     lelevation_mean |   .0130909   .0094243     1.39   0.170    -.0057606    .0319423
       lvcf_mean2013 |  -.0117367   .0127732    -0.92   0.362    -.0372868    .0138135
ldmsp_mean_light2013 |   .0085817   .0107857     0.80   0.429    -.0129928    .0301563
           lyear2013 |  -.0135505   .0056428    -2.40   0.019    -.0248377   -.0022633
           lprec2013 |   .0033559   .0148407     0.23   0.822    -.0263299    .0330417
          ltmean2013 |    .194862   .2849136     0.68   0.497      -.37505     .764774
                     |
           state_id2 |
                  07 |  -.0977399   .0344553    -2.84   0.006    -.1666608    -.028819
                  08 |  -.0197922   .0367808    -0.54   0.592    -.0933648    .0537804
                  09 |  -.0071806   .0349712    -0.21   0.838    -.0771334    .0627722
                  10 |   .0352983   .0451267     0.78   0.437    -.0549685    .1255652
                  19 |  -.0996996   .0405225    -2.46   0.017    -.1807566   -.0186426
                  21 |  -.0852802   .0509004    -1.68   0.099    -.1870961    .0165357
                  23 |  -.0398631   .0561977    -0.71   0.481    -.1522752    .0725491
                  24 |  -.1207808   .0476104    -2.54   0.014    -.2160157   -.0255459
                  27 |  -.0969799   .0440684    -2.20   0.032    -.1851297     -.00883
                  28 |  -.1074932    .053704    -2.00   0.050    -.2149171   -.0000692
                  29 |  -.0927427   .0385901    -2.40   0.019    -.1699344    -.015551
                  32 |   .0427553   .0485485     0.88   0.382    -.0543561    .1398667
                  33 |  -.0533523   .0569229    -0.94   0.352    -.1672151    .0605105
                     |
               _cons |  -.3434168   .9726428    -0.35   0.725    -2.288992    1.602159
--------------------------------------------------------------------------------------
F test of excluded instruments:
  F(  1,    70) =    13.33
  Prob > F      =   0.0005
Sanderson-Windmeijer multivariate F test of excluded instruments:
  F(  1,    70) =    13.33
  Prob > F      =   0.0005

Summary results for first-stage regressions
-------------------------------------------

                                      (Underid)             (Weak id)
Variable     | F(  1,    70)  P-val | SW Chi-sq(  1) P-val | SW F(  1,    70)
construction |       13.33   0.0005 |        18.48  0.0000 |       13.33

NB: first-stage test statistics cluster-robust

Stock-Yogo weak ID F test critical values for single endogenous regressor:
                                   10% maximal IV size             16.38
                                   15% maximal IV size              8.96
                                   20% maximal IV size              6.66
                                   25% maximal IV size              5.53
Source: Stock-Yogo (2005).  Reproduced by permission.
NB: Critical values are for i.i.d. errors only.

Underidentification test
Ho: matrix of reduced form coefficients has rank=K1-1 (underidentified)
Ha: matrix has rank=K1 (identified)
Kleibergen-Paap rk LM statistic          Chi-sq(1)=11.74    P-val=0.0006

Weak identification test
Ho: equation is weakly identified
Cragg-Donald Wald F statistic                                      17.72
Kleibergen-Paap Wald rk F statistic                                13.33

Stock-Yogo weak ID test critical values for K1=1 and L1=1:
                                   10% maximal IV size             16.38
                                   15% maximal IV size              8.96
                                   20% maximal IV size              6.66
                                   25% maximal IV size              5.53
Source: Stock-Yogo (2005).  Reproduced by permission.
NB: Critical values are for Cragg-Donald F statistic and i.i.d. errors.

Weak-instrument-robust inference
Tests of joint significance of endogenous regressors B1 in main equation
Ho: B1=0 and orthogonality conditions are valid
Anderson-Rubin Wald test           F(1,70)=        13.43     P-val=0.0005
Anderson-Rubin Wald test           Chi-sq(1)=      18.61     P-val=0.0000
Stock-Wright LM S statistic        Chi-sq(1)=          .     P-val=     .

NB: Underidentification, weak identification and weak-identification-robust
    test statistics cluster-robust

Number of clusters             N_clust  =         71
Number of observations               N  =         83
Number of regressors                 K  =         23
Number of endogenous regressors      K1 =          1
Number of instruments                L  =         23
Number of excluded instruments       L1 =          1

IV (2SLS) estimation
--------------------

Estimates efficient for homoskedasticity only
Statistics robust to heteroskedasticity and clustering on district_id

Number of clusters (district_id) =     71             Number of obs =       83
                                                      F( 22,    70) =   241.16
                                                      Prob > F      =   0.0000
Total (centered) SS     =  19.09128377                Centered R2   =   0.6885
Total (uncentered) SS   =  810.8011231                Uncentered R2 =   0.9927
Residual SS             =  5.947878021                Root MSE      =    .2677

--------------------------------------------------------------------------------------
                     |               Robust
     lannual_avg_no2 | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
---------------------+----------------------------------------------------------------
    construction2009 |   5.682213    1.56244     3.64   0.000     2.619886    8.744541
   manufacturing2009 |   .0545445    .434132     0.13   0.900    -.7963386    .9054277
                 tfp |    .197584   .0646678     3.06   0.002     .0708374    .3243305
     lelevation_mean |   -.042575   .0633236    -0.67   0.501    -.1666869     .081537
       lvcf_mean2013 |   .1089546   .0650508     1.67   0.094    -.0185426    .2364517
ldmsp_mean_light2013 |   .1503779   .0623716     2.41   0.016     .0281318    .2726241
           lyear2013 |   .0840521   .0487457     1.72   0.085    -.0114878     .179592
           lprec2013 |  -.1540445   .1267045    -1.22   0.224    -.4023807    .0942917
          ltmean2013 |  -2.112641    2.35731    -0.90   0.370    -6.732883    2.507601
                     |
           state_id2 |
                  07 |   .8731768   .3254068     2.68   0.007     .2353911    1.510963
                  08 |   .3533654   .3874195     0.91   0.362    -.4059629    1.112694
                  09 |   .3033463   .3266158     0.93   0.353    -.3368089    .9435014
                  10 |    .495316   .4479777     1.11   0.269    -.3827041    1.373336
                  19 |   1.245583   .3559856     3.50   0.000     .5478636    1.943302
                  21 |   .4229356   .4275996     0.99   0.323    -.4151443    1.261015
                  23 |   .3273937   .4020913     0.81   0.416    -.4606908    1.115478
                  24 |   .5723936   .4079419     1.40   0.161    -.2271578    1.371945
                  27 |   .8046118    .376948     2.13   0.033     .0658072    1.543416
                  28 |   .3735878   .4637364     0.81   0.420    -.5353188    1.282494
                  29 |  -.1430758   .3879087    -0.37   0.712     -.903363    .6172113
                  32 |  -.9978864   .4648677    -2.15   0.032     -1.90901   -.0867624
                  33 |   .3270571   .4880919     0.67   0.503    -.6295854      1.2837
                     |
               _cons |   8.155018    7.66243     1.06   0.287    -6.863069     23.1731
--------------------------------------------------------------------------------------
Underidentification test (Kleibergen-Paap rk LM statistic):             11.741
                                                   Chi-sq(1) P-val =    0.0006
------------------------------------------------------------------------------
Weak identification test (Cragg-Donald Wald F statistic):               17.724
                         (Kleibergen-Paap rk Wald F statistic):         13.328
Stock-Yogo weak ID test critical values: 10% maximal IV size             16.38
                                         15% maximal IV size              8.96
                                         20% maximal IV size              6.66
                                         25% maximal IV size              5.53
Source: Stock-Yogo (2005).  Reproduced by permission.
NB: Critical values are for Cragg-Donald F statistic and i.i.d. errors.
------------------------------------------------------------------------------
Warning: estimated covariance matrix of moment conditions not of full rank.
         overidentification statistic not reported, and standard errors and
         model tests should be interpreted with caution.
Possible causes:
         number of clusters insufficient to calculate robust covariance matrix
         singleton dummy variable (dummy with one 1 and N-1 0s or vice versa)
partial option may address problem.
------------------------------------------------------------------------------
Instrumented:         construction2009
Included instruments: manufacturing2009 tfp lelevation_mean lvcf_mean2013
                      ldmsp_mean_light2013 lyear2013 lprec2013 ltmean2013
                      3.state_id2 4.state_id2 5.state_id2 6.state_id2
                      7.state_id2 8.state_id2 9.state_id2 10.state_id2
                      11.state_id2 12.state_id2 13.state_id2 14.state_id2
                      15.state_id2
Excluded instruments: lHubDist
------------------------------------------------------------------------------

weakiv ivreg2 lannual_avg_no2 (construction2009 = lHubDist) manufacturing2009 tfp lelevation_mean lvcf_mean2013 ldmsp_mean_light2013 lyear2013 lprec2013 ltmean2013 i.state_id2, cl(district_id)
Estimating model for Wald tests using ivreg2...
type mismatch

# Using robust option

ivreg2 lannual_avg_no2 (construction2009 = lHubDist) manufacturing2009 tfp lelevation_mean lvcf_mean2013 ldmsp_mean_light2013 lyear2013 lprec2013 ltmean2013 i.state_id2, robust first

First-stage regressions
-----------------------

First-stage regression of construction2009:

Statistics robust to heteroskedasticity
Number of obs =     83
--------------------------------------------------------------------------------------
                     |               Robust
    construction2009 | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
---------------------+----------------------------------------------------------------
            lHubDist |  -.0184343   .0047659    -3.87   0.000    -.0279676   -.0089011
   manufacturing2009 |  -.0080146   .0681988    -0.12   0.907    -.1444325    .1284033
                 tfp |  -.0248488   .0106259    -2.34   0.023    -.0461038   -.0035939
     lelevation_mean |   .0130909   .0090265     1.45   0.152    -.0049649    .0311466
       lvcf_mean2013 |  -.0117367   .0120301    -0.98   0.333    -.0358004    .0123271
ldmsp_mean_light2013 |   .0085817   .0101256     0.85   0.400    -.0116724    .0288359
           lyear2013 |  -.0135505   .0055968    -2.42   0.019    -.0247457   -.0023553
           lprec2013 |   .0033559   .0144947     0.23   0.818    -.0256378    .0323496
          ltmean2013 |    .194862   .2837048     0.69   0.495    -.3726321    .7623561
                     |
           state_id2 |
                  07 |  -.0977399   .0327464    -2.98   0.004    -.1632425   -.0322373
                  08 |  -.0197922   .0360276    -0.55   0.585     -.091858    .0522737
                  09 |  -.0071806   .0343292    -0.21   0.835    -.0758493    .0614881
                  10 |   .0352983   .0444134     0.79   0.430    -.0535416    .1241383
                  19 |  -.0996996    .039216    -2.54   0.014    -.1781433   -.0212559
                  21 |  -.0852802   .0499134    -1.71   0.093    -.1851218    .0145614
                  23 |  -.0398631   .0558556    -0.71   0.478    -.1515909    .0718648
                  24 |  -.1207808      .0467    -2.59   0.012    -.2141948   -.0273669
                  27 |  -.0969799   .0435774    -2.23   0.030    -.1841477    -.009812
                  28 |  -.1074932    .052825    -2.03   0.046    -.2131588   -.0018275
                  29 |  -.0927427   .0378263    -2.45   0.017    -.1684066   -.0170789
                  32 |   .0427553   .0480659     0.89   0.377    -.0533909    .1389015
                  33 |  -.0533523   .0562508    -0.95   0.347    -.1658707     .059166
                     |
               _cons |  -.3434168   .9686871    -0.35   0.724     -2.28108    1.594246
--------------------------------------------------------------------------------------
F test of excluded instruments:
  F(  1,    60) =    14.96
  Prob > F      =   0.0003
Sanderson-Windmeijer multivariate F test of excluded instruments:
  F(  1,    60) =    14.96
  Prob > F      =   0.0003

Summary results for first-stage regressions
-------------------------------------------

                                      (Underid)             (Weak id)
Variable     | F(  1,    60)  P-val | SW Chi-sq(  1) P-val | SW F(  1,    60)
construction |       14.96   0.0003 |        20.70  0.0000 |       14.96

NB: first-stage test statistics heteroskedasticity-robust

Stock-Yogo weak ID F test critical values for single endogenous regressor:
                                   10% maximal IV size             16.38
                                   15% maximal IV size              8.96
                                   20% maximal IV size              6.66
                                   25% maximal IV size              5.53
Source: Stock-Yogo (2005).  Reproduced by permission.
NB: Critical values are for i.i.d. errors only.

Underidentification test
Ho: matrix of reduced form coefficients has rank=K1-1 (underidentified)
Ha: matrix has rank=K1 (identified)
Kleibergen-Paap rk LM statistic          Chi-sq(1)=14.94    P-val=0.0001

Weak identification test
Ho: equation is weakly identified
Cragg-Donald Wald F statistic                                      17.72
Kleibergen-Paap Wald rk F statistic                                14.96

Stock-Yogo weak ID test critical values for K1=1 and L1=1:
                                   10% maximal IV size             16.38
                                   15% maximal IV size              8.96
                                   20% maximal IV size              6.66
                                   25% maximal IV size              5.53
Source: Stock-Yogo (2005).  Reproduced by permission.
NB: Critical values are for Cragg-Donald F statistic and i.i.d. errors.

Weak-instrument-robust inference
Tests of joint significance of endogenous regressors B1 in main equation
Ho: B1=0 and orthogonality conditions are valid
Anderson-Rubin Wald test           F(1,60)=        15.63     P-val=0.0002
Anderson-Rubin Wald test           Chi-sq(1)=      21.62     P-val=0.0000
Stock-Wright LM S statistic        Chi-sq(1)=          .     P-val=     .

NB: Underidentification, weak identification and weak-identification-robust
    test statistics heteroskedasticity-robust

Number of observations               N  =         83
Number of regressors                 K  =         23
Number of endogenous regressors      K1 =          1
Number of instruments                L  =         23
Number of excluded instruments       L1 =          1

IV (2SLS) estimation
--------------------

Estimates efficient for homoskedasticity only
Statistics robust to heteroskedasticity

                                                      Number of obs =       83
                                                      F( 22,    60) =   245.03
                                                      Prob > F      =   0.0000
Total (centered) SS     =  19.09128377                Centered R2   =   0.6885
Total (uncentered) SS   =  810.8011231                Uncentered R2 =   0.9927
Residual SS             =  5.947878021                Root MSE      =    .2677

--------------------------------------------------------------------------------------
                     |               Robust
     lannual_avg_no2 | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
---------------------+----------------------------------------------------------------
    construction2009 |   5.682213   1.533766     3.70   0.000     2.676088    8.688339
   manufacturing2009 |   .0545445   .4264853     0.13   0.898    -.7813513    .8904404
                 tfp |    .197584   .0626455     3.15   0.002     .0748011    .3203668
     lelevation_mean |   -.042575   .0618594    -0.69   0.491    -.1638171    .0786672
       lvcf_mean2013 |   .1089546    .061616     1.77   0.077    -.0118106    .2297197
ldmsp_mean_light2013 |   .1503779   .0594219     2.53   0.011     .0339131    .2668428
           lyear2013 |   .0840521   .0481032     1.75   0.081    -.0102285    .1783327
           lprec2013 |  -.1540445   .1207354    -1.28   0.202    -.3906815    .0825925
          ltmean2013 |  -2.112641   2.343997    -0.90   0.367     -6.70679    2.481508
                     |
           state_id2 |
                  07 |   .8731768   .3240291     2.69   0.007     .2380915    1.508262
                  08 |   .3533654   .3832305     0.92   0.356    -.3977527    1.104483
                  09 |   .3033463   .3242123     0.94   0.349    -.3320981    .9387906
                  10 |    .495316   .4407528     1.12   0.261    -.3685436    1.359176
                  19 |   1.245583    .347189     3.59   0.000     .5651047     1.92606
                  21 |   .4229356   .4193161     1.01   0.313    -.3989089     1.24478
                  23 |   .3273937   .3991721     0.82   0.412    -.4549693    1.109757
                  24 |   .5723936   .4013948     1.43   0.154    -.2143258    1.359113
                  27 |   .8046118   .3749752     2.15   0.032      .069674     1.53955
                  28 |   .3735878   .4596281     0.81   0.416    -.5272667    1.274442
                  29 |  -.1430758   .3859589    -0.37   0.711    -.8995414    .6133898
                  32 |  -.9978864   .4594633    -2.17   0.030    -1.898418   -.0973549
                  33 |   .3270571   .4847485     0.67   0.500    -.6230325    1.277147
                     |
               _cons |   8.155018   7.626706     1.07   0.285     -6.79305    23.10309
--------------------------------------------------------------------------------------
Underidentification test (Kleibergen-Paap rk LM statistic):             14.940
                                                   Chi-sq(1) P-val =    0.0001
------------------------------------------------------------------------------
Weak identification test (Cragg-Donald Wald F statistic):               17.724
                         (Kleibergen-Paap rk Wald F statistic):         14.961
Stock-Yogo weak ID test critical values: 10% maximal IV size             16.38
                                         15% maximal IV size              8.96
                                         20% maximal IV size              6.66
                                         25% maximal IV size              5.53
Source: Stock-Yogo (2005).  Reproduced by permission.
NB: Critical values are for Cragg-Donald F statistic and i.i.d. errors.
------------------------------------------------------------------------------
Warning: estimated covariance matrix of moment conditions not of full rank.
         overidentification statistic not reported, and standard errors and
         model tests should be interpreted with caution.
Possible causes:
         singleton dummy variable (dummy with one 1 and N-1 0s or vice versa)
partial option may address problem.
------------------------------------------------------------------------------
Instrumented:         construction2009
Included instruments: manufacturing2009 tfp lelevation_mean lvcf_mean2013
                      ldmsp_mean_light2013 lyear2013 lprec2013 ltmean2013
                      3.state_id2 4.state_id2 5.state_id2 6.state_id2
                      7.state_id2 8.state_id2 9.state_id2 10.state_id2
                      11.state_id2 12.state_id2 13.state_id2 14.state_id2
                      15.state_id2
Excluded instruments: lHubDist
------------------------------------------------------------------------------

weakiv ivreg2 lannual_avg_no2 (construction2009 = lHubDist) manufacturing2009 tfp lelevation_mean lvcf_mean2013 ldmsp_mean_light2013 lyear2013 lprec2013 ltmean2013 i.state_id2, robust first
Estimating model for Wald tests using ivreg2...
Estimating confidence sets over 100 grid points
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
..................................................    50
..................................................   100

Weak instrument robust tests and confidence sets for linear IV
H0: beta[lannual_avg_no2:construction2009] = 0
------------------------------------------------------------------------------
 Test |       Statistic         p-value | Conf. level          Conf. Set
------+---------------------------------+-------------------------------------
   AR |    chi2(1) =    12.66    0.0004 |     95%       [ 3.07083, 10.1155]
------+---------------------------------+-------------------------------------
 Wald |    chi2(1) =    13.73    0.0002 |     95%       [ 2.67609, 8.68834]
------------------------------------------------------------------------------
Confidence sets estimated for 100 points in [-.330038, 11.6945].
Number of obs N = 83.
Method = lagrange multiplier (LM).
Tests robust to heteroskedasticity.
Wald statistic in last row is based on ivreg2 estimation and is not robust to weak instruments.
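
One possible cross-check on the AR result that does not rely on weakiv, sketched under the assumption that the model stays just-identified (one endogenous regressor, one excluded instrument): in that case the Anderson-Rubin test of B1 = 0 amounts to testing whether lHubDist has a zero coefficient in the reduced-form regression of the outcome on the instrument and the controls. The commands below use only the built-in regress and test commands with the variable names from the ivreg2 call above. The resulting F statistic should be comparable to the Anderson-Rubin Wald F that ivreg2 reports with the 'first' option, though small-sample adjustments may make the numbers differ slightly.

```stata
* Reduced form: outcome on the excluded instrument plus all included controls,
* with standard errors clustered at the district level.
* (The /// line continuations assume this is run from a do-file.)
regress lannual_avg_no2 lHubDist manufacturing2009 tfp lelevation_mean ///
    lvcf_mean2013 ldmsp_mean_light2013 lyear2013 lprec2013 ltmean2013 ///
    i.state_id2, vce(cluster district_id)

* Wald test that the instrument has no effect on the outcome.
* Under the just-identified assumption this tests the same null as the AR test.
test lHubDist
```

Replacing vce(cluster district_id) with vce(robust) would give the heteroskedasticity-robust analogue for comparison with the second ivreg2 run.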