Dear Statalisters,
My model estimates bank risk from bank-specific and macroeconomic variables. Because risk is persistent and the bank-specific variables are potentially endogenous, both of which are documented in the literature, I have used two-step system GMM estimation. My variables are as follows.
Dependent: StrScore
Endogenous Regressors: NIM, CRAR, ContLiab, CorpLoan, OpExpOpRev, ROA
Strictly Exogenous Regressors: PCR, Size, Pub_Dummy ("0" if Private, "1" if Public), GDPGr, GsecYld, CMR, EPUInd, CPInfl, ExcUSD and Year dummies
Below are the code and results from one of my many trials; I would appreciate your suggestions in clarifying my doubts.
Based on my very limited understanding of GMM estimation, gathered mostly from Roodman (2009) and various Statalist discussions, I have the questions listed after the code and output below.
Code:
xtabond2 ln_StrsScore L.ln_StrsScore L.Pub_Dummy L.CRAR L.GNPA L.PCR L.NIM ///
    L.CorpLoan L.ContLiab L.OpExpOpRev L.Size L.ROA ///
    GDPG GsecYld CMR EPUInd CPInfl ExcUSD, ///
    gmmstyle(ln_StrsScore, lag(2 4) collapse) ///
    gmmstyle(NIM CRAR ContLiab CorpLoan OpExpOpRev L.ROA, lag(2 3) collapse) ///
    ivstyle(Year2-Year18 L.Pub_Dummy L.PCR L.Size) twostep robust

Favoring speed over space. To switch, type or click on mata: mata set matafavor space, perm.
Warning: Number of instruments may be large relative to number of observations.
Warning: Two-step estimated covariance matrix of moments is singular.
  Using a generalized inverse to calculate optimal weighting matrix for two-step estimation.
  Difference-in-Sargan/Hansen statistics may be negative.

Dynamic panel-data estimation, two-step system GMM
------------------------------------------------------------------------------
Group variable: BankID                          Number of obs      =       643
Time variable : Year                            Number of groups   =        39
Number of instruments = 42                      Obs per group: min =        14
Wald chi2(17) =  2.30e+06                                      avg =     16.49
Prob > chi2   =     0.000                                      max =        17
------------------------------------------------------------------------------
             |              Corrected
ln_StrsScore | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
ln_StrsScore |
         L1. |    .641975   .0833203     7.70   0.000     .4786701    .8052798
             |
   Pub_Dummy |
         L1. |   .0499874   .0425268     1.18   0.240    -.0333635    .1333384
             |
        CRAR |
         L1. |  -.1038596   .3221506    -0.32   0.747    -.7352632    .5275441
             |
        GNPA |
         L1. |   .5465262   .6220631     0.88   0.380     -.672695    1.765747
             |
         PCR |
         L1. |    .091487   .0414615     2.21   0.027     .0102239      .17275
             |
         NIM |
         L1. |  -7.446248   2.341253    -3.18   0.001    -12.03502   -2.857477
             |
    CorpLoan |
         L1. |  -.2940318   .2343635    -1.25   0.210    -.7533757    .1653122
             |
    ContLiab |
         L1. |   .0461053   .0245001     1.88   0.060    -.0019139    .0941245
             |
  OpExpOpRev |
         L1. |   1.400362   .2287579     6.12   0.000     .9520049    1.848719
             |
        Size |
         L1. |   .2713295   .0970137     2.80   0.005     .0811861    .4614729
             |
         ROA |
         L1. |   8.974446   3.907885     2.30   0.022     1.315132    16.63376
             |
       GDPGr |  -.6154063   .3196053    -1.93   0.054    -1.241821    .0110085
     GsecYld |   11.43682   1.912719     5.98   0.000     7.687965    15.18568
         CMR |  -1.021948   .9621678    -1.06   0.288    -2.907762     .863866
      EPUInd |  -.0001234   .0003283    -0.38   0.707    -.0007669    .0005201
      CPInfl |   .5629143   .4734788     1.19   0.234     -.365087    1.490916
      ExcUSD |  -.0003851   .0018006    -0.21   0.831    -.0039141     .003144
       _cons |  -1.791809   .4441954    -4.03   0.000    -2.662416   -.9212017
------------------------------------------------------------------------------
Instruments for first differences equation
  Standard
    D.(Year2 Year3 Year4 Year5 Year6 Year7 Year8 Year9 Year10 Year11 Year12
    Year13 Year14 Year15 Year16 Year17 Year18 L.Pub_Dummy L.PCR L.Size)
  GMM-type (missing=0, separate instruments for each period unless collapsed)
    L(2/3).(NIM CRAR ContLiab CorpLoan OpExpOpRev L.ROA) collapsed
    L(2/4).ln_StrsScore collapsed
Instruments for levels equation
  Standard
    Year2 Year3 Year4 Year5 Year6 Year7 Year8 Year9 Year10 Year11 Year12
    Year13 Year14 Year15 Year16 Year17 Year18 L.Pub_Dummy L.PCR L.Size
    _cons
  GMM-type (missing=0, separate instruments for each period unless collapsed)
    DL.(NIM CRAR ContLiab CorpLoan OpExpOpRev L.ROA) collapsed
    DL.ln_StrsScore collapsed
------------------------------------------------------------------------------
Arellano-Bond test for AR(1) in first differences: z =  -3.33  Pr > z =  0.001
Arellano-Bond test for AR(2) in first differences: z =  -1.46  Pr > z =  0.144
------------------------------------------------------------------------------
Sargan test of overid. restrictions: chi2(24)   =  75.30  Prob > chi2 =  0.000
  (Not robust, but not weakened by many instruments.)
Hansen test of overid. restrictions: chi2(24)   =  26.90  Prob > chi2 =  0.309
  (Robust, but weakened by many instruments.)

Difference-in-Hansen tests of exogeneity of instrument subsets:
  GMM instruments for levels
    Hansen test excluding group:     chi2(17)   =  21.02  Prob > chi2 =  0.226
    Difference (null H = exogenous): chi2(7)    =   5.88  Prob > chi2 =  0.554
  gmm(ln_StrsScore, collapse lag(2 4))
    Hansen test excluding group:     chi2(20)   =  24.28  Prob > chi2 =  0.230
    Difference (null H = exogenous): chi2(4)    =   2.61  Prob > chi2 =  0.625
  gmm(NIM CRAR ContLiab CorpLoan OpExpOpRev L.ROA, collapse lag(2 3))
    Hansen test excluding group:     chi2(6)    =  16.51  Prob > chi2 =  0.011
    Difference (null H = exogenous): chi2(18)   =  10.38  Prob > chi2 =  0.919
  iv(Year2 Year3 Year4 Year5 Year6 Year7 Year8 Year9 Year10 Year11 Year12
  Year13 Year14 Year15 Year16 Year17 Year18 L.Pub_Dummy L.PCR L.Size)
    Hansen test excluding group:     chi2(4)    =   4.54  Prob > chi2 =  0.337
    Difference (null H = exogenous): chi2(20)   =  22.35  Prob > chi2 =  0.322
1. Does the way I have specified the lags in the gmm() and iv() options look correct?
2. Roodman advises reporting the number of instruments and keeping it below the number of groups. Does having more instruments than groups (42 instruments against 39 groups here) invalidate the estimation if all other tests, such as AR(2), the Hansen test of overidentifying restrictions and the Difference-in-Hansen tests, are satisfactory?
3. Is it compulsory to include all strictly exogenous variables in the iv() instruments?
4. In one of the discussions, Sebastian Kripfganz states that a low p-value for the AR(2) test is potentially concerning. In my case, how should I interpret the AR(2) p-value of 0.144?
5. As I understand it, the Hansen test is the preferred test of overidentifying restrictions when two-step system GMM with robust standard errors is used, because the assumptions of homoskedasticity and no serial correlation are relaxed. In that light, should I consider the Hansen test p-value of 0.309 problematic, given that it exceeds the upper limit of 0.25?
6. The Hansen test excluding group for the second gmm() instrument set (NIM CRAR ContLiab CorpLoan OpExpOpRev L.ROA; chi2(6) = 16.51) rejects the null with a p-value of 0.011. How does this affect the estimation, given that my AR(2) test is fine?
7. I understand that the Difference-in-Hansen test is not reported for a group of instruments if excluding that group would leave fewer instruments than regressors (an under-identified model). In one of my attempts, the Difference-in-Hansen test was not reported for the iv() instruments, while it was reported for the gmm() instruments and did not reject the null hypothesis. Should this be a cause for concern?
As an aside, can you explain how Stata counts the number of instruments, e.g. 42 in this case?
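For reference, here is a rough arithmetic tally that reproduces the 42. It is only a sketch in Stata, not a description of what xtabond2 does internally. It assumes that, with collapse, each gmm() variable contributes one instrument per lag in the difference equation plus one lagged difference (DL.) in the levels equation, and that each iv() variable counts once. The scalar names are made up for illustration. The tally leaves open how _cons, which the output lists among the levels-equation instruments, enters the count.

```stata
* Rough tally of the instrument count, read off the xtabond2 output above.
* Assumption: group sizes follow the collapsed-instrument logic described in
* the lead-in; scalar names are hypothetical.

scalar n_gmm_dep   = 3 + 1      // L(2/4).ln_StrsScore plus DL.ln_StrsScore
scalar n_gmm_endog = 6*2 + 6    // L(2/3) of six variables plus six DL. terms
scalar n_iv        = 17 + 3     // Year2-Year18 plus L.Pub_Dummy L.PCR L.Size

scalar n_instr = n_gmm_dep + n_gmm_endog + n_iv
display "instruments: " n_instr

* Hansen degrees of freedom = instruments minus estimated parameters
scalar n_param = 17 + 1         // 17 slope coefficients plus _cons
display "Hansen df:   " n_instr - n_param
```

The three group sizes (4, 18 and 20) equal the degrees of freedom of the corresponding "Difference" lines in the Difference-in-Hansen section, and the two displayed values agree with the reported 42 instruments and the Hansen chi2(24).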
Sorry for posting such a long list of questions, and thank you for reading patiently. I would really appreciate any suggestions and clarifications.
Thanks
pankaj