GLS vs GMM?

Himanshu Bajaj

Join Date: Sep 2020
Posts: 2

09 Sep 2020, 16:58

I am analysing the impact of non-interest income on bank risk in EU countries over the period 2015 to 2019. I have an unbalanced panel of 600 banks.

The seminal literature used generalised least squares (GLS) regression, while more recent papers use system GMM via xtabond2. However, none of these papers actually specifies which instrument variables were used in the GMM regressions!

My dependent variable is y9 = Z-score for each bank i and time t, computed as (ROA_it + (E/A)_it) / abs(ROA_it - average(ROA_t)). I have taken the log of this. This allows me to include a lag of this variable in my regression. My independent variables are x1 = non-interest income/operating revenues, x2 = loans/assets, x3 = deposits/assets, x4 = equity/assets, x6 = log(assets), and x7 = growth rate of assets.
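
For reference, the dependent variable described above can be written out as follows. This is only one reading of the formula: it assumes the numerator is the sum of ROA and the equity-to-assets ratio, and it keeps the averaging in the denominator exactly as written, since the post does not say over which dimension the average of ROA is taken.

```latex
\[
  y9_{it} = \ln Z_{it},
  \qquad
  Z_{it} = \frac{\mathrm{ROA}_{it} + (E/A)_{it}}
                {\left|\, \mathrm{ROA}_{it} - \overline{\mathrm{ROA}}_{t} \,\right|}
\]
```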

I have tried both GLS and GMM regressions, and they give substantially different results. Which one should I go for?
Per GLS, every independent variable apart from x7 is significant, while per GMM only x4 and x6 are significant. I am confused!

I ran the GMM regression using the command below (output also presented). The IV variables are GDP and inflation for each country, x6 = log(assets), and x7 = growth rate of assets. Have I chosen the right IV variables? The results meet the usual requirements: the AR(1) test is significant and the AR(2) test is not. The Hansen test is not significant either.

xtabond2 y9_Z L.y9_Z x1_NII x2_LOANS x4_EQUITY x3_DEPOSITS x6_LNTA x7_GTA, gmm(L.y9_Z x1_NII x2_LOANS x4_EQUITY x3_DEPOSITS x2_LOANS) iv(x7_GTA x6_LNTA GDP Inflation) robust small orth

Code:

Dynamic panel-data estimation, one-step system GMM
------------------------------------------------------------------------------
Group variable: S_Number                        Number of obs      =      1994
Time variable : Year                            Number of groups   =       562
Number of instruments = 66                      Obs per group: min =         1
F(7, 561)     =      1.85                                      avg =      3.55
Prob > F      =     0.075                                      max =         4
------------------------------------------------------------------------------
             |               Robust
        y9_Z |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        y9_Z |
         L1. |   .0058148   .0394814     0.15   0.883    -.0717346    .0833641
             |
      x1_NII |  -.4605312   .4601922    -1.00   0.317    -1.364442    .4433791
    x2_LOANS |   .4894449    .868123     0.56   0.573    -1.215724    2.194614
   x4_EQUITY |    3.75577   2.089344     1.80   0.073    -.3481229    7.859662
 x3_DEPOSITS |   1.808018   1.344987     1.34   0.179    -.8338077    4.449843
     x6_LNTA |   .1174653   .0465224     2.52   0.012      .026086    .2088446
      x7_GTA |   .2200507   .1720606     1.28   0.201     -.117911    .5580124
       _cons |   .5578179   2.031345     0.27   0.784    -3.432152    4.547788
------------------------------------------------------------------------------
Instruments for orthogonal deviations equation
  Standard
    FOD.(x7_w_GTA x6_LNTA GDP Inflation)
  GMM-type (missing=0, separate instruments for each period unless collapsed)
    L(1/4).(L.y9_Z x1_NII x4_EQUITY x3_DEPOSITS x2_LOANS)
Instruments for levels equation
  Standard
    x7_w_GTA x6_LNTA GDP Inflation
    _cons
  GMM-type (missing=0, separate instruments for each period unless collapsed)
    D.(L.y9_Z x1_NII x4_EQUITY x3_DEPOSITS x2_LOANS)
------------------------------------------------------------------------------
Arellano-Bond test for AR(1) in first differences: z = -10.46  Pr > z =  0.000
Arellano-Bond test for AR(2) in first differences: z =  -1.39  Pr > z =  0.164
------------------------------------------------------------------------------
Sargan test of overid. restrictions: chi2(58)   = 140.00  Prob > chi2 =  0.000
  (Not robust, but not weakened by many instruments.)
Hansen test of overid. restrictions: chi2(58)   =  70.96  Prob > chi2 =  0.118
  (Robust, but weakened by many instruments.)

Difference-in-Hansen tests of exogeneity of instrument subsets:
  GMM instruments for levels
    Hansen test excluding group:     chi2(39)   =  44.95  Prob > chi2 =  0.237
    Difference (null H = exogenous): chi2(19)   =  26.02  Prob > chi2 =  0.130
  iv(x7_GTA x6_LNTA GDP Inflation)
    Hansen test excluding group:     chi2(54)   =  64.85  Prob > chi2 =  0.148
    Difference (null H = exogenous): chi2(4)    =   6.12  Prob > chi2 =  0.191
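
The output above reports 66 instruments and notes that the Hansen test is weakened by many instruments. One commonly used way to check whether the results are sensitive to the instrument count is to collapse the GMM-style instruments and cap the lag depth. The following is a minimal sketch under assumptions, not a recommended specification: it reuses the variable names from the command above, and the lag range `1 2` is chosen purely for illustration.

```stata
* Sketch only: same regressors as the command above, but with collapsed
* GMM-style instruments and a restricted lag range to cut the instrument count.
* laglimits(1 2) is an illustrative assumption, not a value from the thread.
* The /// line continuations require running this from a do-file.
xtabond2 y9_Z L.y9_Z x1_NII x2_LOANS x4_EQUITY x3_DEPOSITS x6_LNTA x7_GTA, ///
    gmm(L.y9_Z x1_NII x2_LOANS x4_EQUITY x3_DEPOSITS, laglimits(1 2) collapse) ///
    iv(x7_GTA x6_LNTA GDP Inflation) ///
    robust small orthogonal
```

If the coefficients and the Hansen p-value change markedly once the instrument set is reduced, that would suggest the original estimates depend on the instrument choice.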

I ran the GLS regression using the command below (output also presented). If I run it without the lag of y9, the results are pretty similar. I checked for heteroskedasticity using hettest, rhs fstat after the OLS regression; the p-value was significant.

xtgls y9 l.y9 x1_NII x2_LOANS x4_EQUITY x3_DEPOSITS x6_LNTA x7_GTA, panels(hetero)

Code:

Cross-sectional time-series FGLS regression

Coefficients:  generalized least squares
Panels:        heteroskedastic
Correlation:   no autocorrelation

Estimated covariances      =       562          Number of obs     =      1,994
Estimated autocorrelations =         0          Number of groups  =        562
Estimated coefficients     =         8          Obs per group:
                                                              min =          1
                                                              avg =   3.548043
                                                              max =          4
                                                Wald chi2(7)      =    1267.07
                                                Prob > chi2       =     0.0000

-------------------------------------------------------------------------------
   y9_w_log_Z |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
--------------+----------------------------------------------------------------
   y9_w_log_Z |
          L1. |   .5033068   .0162615    30.95   0.000     .4714348    .5351788
              |
     x1_w_NII |  -.2555628   .0717397    -3.56   0.000      -.39617   -.1149557
   x2_w_LOANS |   .3151786   .0788558     4.00   0.000     .1606242    .4697331
  x4_w_EQUITY |   1.724303   .3035333     5.68   0.000     1.129389    2.319218
x3_w_DEPOSITS |   .3271387    .157598     2.08   0.038     .0182523    .6360251
      x6_LNTA |   .0530074   .0099545     5.32   0.000     .0334969    .0725179
     x7_w_GTA |   .0916365   .0823851     1.11   0.266    -.0698354    .2531084
        _cons |   .8158792   .2733118     2.99   0.003     .2801979     1.35156
-------------------------------------------------------------------------------

Thank you so much for your help!

Tags: GLS, gmm, panel data, xtabond2, xtgls

Carlo Lazzaro

Join Date: Apr 2014

Posts: 17714
#2

10 Sep 2020, 01:52

Himanshu:
welcome to this forum.
Some comments about your query:
- most of your choice depends on what the literature in your research field recommends, as -xtgls- is for T>N static panel data regression models, whereas -xtabond- is for dynamic ones;
- should you go static, however, you should probably consider -xtreg- instead of -xtgls-, as in your panel dataset N>T.
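
A static specification along the lines suggested above might look like the following. This is a minimal sketch under assumptions: the panel and time identifiers `S_Number` and `Year` are taken from the output earlier in the thread, and both the fixed-effects estimator and the cluster-robust standard errors are illustrative choices rather than something the thread settles.

```stata
* Sketch only: static panel model without the lagged dependent variable.
* fe and vce(cluster ...) are illustrative assumptions.
xtset S_Number Year
xtreg y9_Z x1_NII x2_LOANS x4_EQUITY x3_DEPOSITS x6_LNTA x7_GTA, ///
    fe vce(cluster S_Number)
```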

Kind regards,
Carlo
(Stata 19.0)
Himanshu Bajaj

Join Date: Sep 2020

Posts: 2
#3

10 Sep 2020, 09:36

Thank you, Carlo!

I have tried xtreg, and the results are pretty similar to xtgls. Does it make sense to include lagged dependent variables in a static model?

I am confused as to why static and dynamic models provide widely different results. How does one decide which results are correct, the static or dynamic ones? It seems to me at this point that one can use any model to show the results one wants!
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17714
#4

10 Sep 2020, 09:58

Himanshu:
static models do not allow the inclusion of the lagged dependent variable among the set of predictors (i.e., the regressors), whereas dynamic models do.
As far as your second question is concerned, the issue is not to find the model that allows you to disseminate "the best" (whatever that may mean) coefficients, but the one that gives a true and fair view of the data generating process you're investigating (the literature in your research field can be a relevant source, in this respect).
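
One informal check that is often used when static and dynamic results disagree is to bracket the coefficient on the lagged dependent variable: pooled OLS tends to bias it upward and the fixed-effects estimator tends to bias it downward in short panels, so a credible GMM estimate would be expected to fall roughly between the two. In this thread the two reported estimates of that coefficient are about 0.006 (system GMM) and 0.503 (FGLS). The sketch below is an assumption-laden illustration using the variable names from the first post, not part of the original advice.

```stata
* Sketch only: bracket the coefficient on L.y9_Z.
* Pooled OLS gives an approximate upper bound, fixed effects a lower bound.
xtset S_Number Year
regress y9_Z L.y9_Z x1_NII x2_LOANS x4_EQUITY x3_DEPOSITS x6_LNTA x7_GTA, ///
    vce(cluster S_Number)
xtreg y9_Z L.y9_Z x1_NII x2_LOANS x4_EQUITY x3_DEPOSITS x6_LNTA x7_GTA, ///
    fe vce(cluster S_Number)
```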

Kind regards,
Carlo
(Stata 19.0)