
  • Two-Step System-GMM vs simple Fixed Effects Regression

    Hello everyone,

    I am currently working on my thesis and was wondering whether my current use of two-step system GMM is useful at all or whether a plain-vanilla FE regression would do the job.

    Roodman (2009) repeatedly mentions that xtabond2 should be used for datasets with small T and large N. At what point is a panel dataset considered to have too large a T and too small an N for this approach?

    My current dataset consists of over 170,000 observations on roughly 18,000 companies over 30 years. As one can see, it is a (heavily) unbalanced panel.

    My results are shown below in case they are of any use.

    Code:
    Dynamic panel-data estimation, two-step system GMM
    ------------------------------------------------------------------------------
    Group variable: gvkey                           Number of obs      =    111060
    Time variable : year                            Number of groups   =     18281
    Number of instruments = 448                     Obs per group: min =         1
    F(14, 18280)  =  14561.00                                      avg =      6.08
    Prob > F      =     0.000                                      max =        29
    -------------------------------------------------------------------------------------
                        |              Corrected
                    COE | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
    --------------------+----------------------------------------------------------------
                    COE |
                    L1. |   .0746159   .0103966     7.18   0.000     .0542377    .0949942
                        |
             numest_log |   .0048291   .0004184    11.54   0.000     .0040091    .0056492
           eps_var_log2 |   .0088655   .0003942    22.49   0.000     .0080929    .0096382
                log_bmr |   .0166525   .0004454    37.39   0.000     .0157795    .0175254
                 mv_log |  -.0102536   .0002588   -39.62   0.000     -.010761   -.0097463
                   BETA |   .0036052   .0002894    12.46   0.000      .003038    .0041724
        financial_dummy |   .0051529   .0009553     5.39   0.000     .0032805    .0070253
           health_dummy |  -.0046829   .0009418    -4.97   0.000    -.0065289   -.0028369
       industrial_dummy |   .0024668   .0008792     2.81   0.005     .0007435      .00419
           it_tel_dummy |  -.0008938   .0009191    -0.97   0.331    -.0026952    .0009077
          oil_gas_dummy |   .0156918   .0018826     8.34   0.000     .0120016    .0193819
        materials_dummy |   .0122339   .0012172    10.05   0.000      .009848    .0146197
    communication_dummy |   .0034575   .0014513     2.38   0.017     .0006128    .0063021
          utility_dummy |   .0033903   .0014856     2.28   0.022     .0004784    .0063023
                  _cons |   .1972499   .0026227    75.21   0.000     .1921092    .2023907
    -------------------------------------------------------------------------------------
    Instruments for orthogonal deviations equation
      GMM-type (missing=0, separate instruments for each period unless collapsed)
        L(1/29).L.COE
    Instruments for levels equation
      Standard
        numest_log eps_var_log2 log_bmr mv_log BETA financial_dummy health_dummy
        industrial_dummy it_tel_dummy oil_gas_dummy materials_dummy
        communication_dummy utility_dummy
        _cons
      GMM-type (missing=0, separate instruments for each period unless collapsed)
        D.L.COE
    ------------------------------------------------------------------------------
    Arellano-Bond test for AR(1) in first differences: z = -24.06  Pr > z =  0.000
    Arellano-Bond test for AR(2) in first differences: z =   0.58  Pr > z =  0.565
    ------------------------------------------------------------------------------
    Sargan test of overid. restrictions: chi2(433)  =3464.68  Prob > chi2 =  0.000
      (Not robust, but not weakened by many instruments.)
    Hansen test of overid. restrictions: chi2(433)  =1785.01  Prob > chi2 =  0.000
      (Robust, but weakened by many instruments.)

    Could you give me an indication whether two-step system GMM is of any use in this setting, or whether a plain FE regression could also do the job?

    Have a nice day! :-)

  • #2
    On average, you have T=6. This is definitely small, especially relative to N=18281. There is no fixed threshold for T being considered small or large.

    The plain FE estimator is biased and inconsistent in this case because of the dynamic nature of the model (lagged dependent variable => Nickell bias). You might find the following presentation useful:
    https://www.kripfganz.de/stata/
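
    As a rough illustration of what the "plain FE" alternative would look like, here is a minimal sketch (not your actual command; the variable names and the panel identifiers gvkey/year are taken from the output you posted). With an average T of about 6, the coefficient on L.COE in such a regression is subject to the Nickell bias, which is of order 1/T:

    Code:
    * Sketch only: "plain FE" counterpart of the dynamic model.
    xtset gvkey year
    * Note: the time-invariant industry dummies would be absorbed by the fixed
    * effects, and the coefficient on L.COE suffers from Nickell bias.
    xtreg COE L.COE numest_log eps_var_log2 log_bmr mv_log BETA, fe vce(cluster gvkey)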



    • #3
      Originally posted by Sebastian Kripfganz
      Hello Mr Kripfganz,

      first of all, thank you very much for your answer and presentation.

      I hope it is okay to ask you a follow-up question:

      It seems that my AR(2) statistic is not significant, so second-order serial correlation is not a major problem in my model, which seems very good. However, Roodman (2009) mentions that the p-values of the Hansen test should be roughly between 0.05 and 0.25, which makes my p-values (0.000) look a little "too good to be true". I am therefore afraid that my model might be flawed.

      I already saw you answering a different question, where you suggested that one could use a difference GMM model instead. Unfortunately, my model still "suffers" from p-values at this level when I apply difference GMM. Could you give me a suggestion on how to deal with this sort of problem? I am just not sure whether going on with these p-values is a good approach.



      • #4
        The null of the Hansen test is that the overidentifying restrictions are valid, hence your p-value of 0.000 is bad news in this regard. Roodman's point was that this p-value should be far above the conventional significance level, as you want to be statistically confident that you are not rejecting the null of validity. Then again, Roodman points out that you can work your way up to a p-value of 1.000 by adding more instruments.
        Try cutting down the number of lags used as instruments and see what happens.
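
        As a purely illustrative sketch (we have not seen your actual command line, so the variable lists and options below are only guesses based on your output), cutting down and collapsing the GMM-type instruments in xtabond2 could look roughly like this:

        Code:
        * Sketch only: restrict the lag depth and collapse the instrument columns
        * to reduce the instrument count (industry dummies omitted for brevity).
        xtabond2 COE L.COE numest_log eps_var_log2 log_bmr mv_log BETA,  ///
            gmm(L.COE, lag(1 3) collapse)                                ///
            iv(numest_log eps_var_log2 log_bmr mv_log BETA)              ///
            twostep robust orthogonal small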



        • #5
          You have not shown us your command line, which makes it a bit difficult to give a helpful answer. In addition to Andreas' good comment, I notice that you seem to have specified most instruments as standard instruments for the level model. This requires that all those instruments are uncorrelated with the unobserved group-specific effects, which is akin to a "random-effects" assumption and often hard to justify.
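
          To make this concrete, here is a hedged sketch (again, not your actual command; the variable lists are guesses from your output) of one possible alternative: instead of entering the regressors as standard instruments for the levels equation, instrument them GMM-style in the transformed equation, treating them as predetermined:

          Code:
          * Sketch only: regressors instrumented in the transformed equation rather
          * than entered as IV-style instruments for the levels equation, which
          * would require them to be uncorrelated with the firm-specific effects.
          xtabond2 COE L.COE numest_log eps_var_log2 log_bmr mv_log BETA,  ///
              gmm(L.COE, lag(1 3) collapse)                                ///
              gmm(numest_log eps_var_log2 log_bmr mv_log BETA,             ///
                  lag(1 2) collapse equation(diff))                        ///
              twostep robust orthogonal small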
          https://www.kripfganz.de/stata/



          • #6
            Originally posted by Andreas Backhaus
            Thank you, Andreas and Sebastian, for your really helpful comments! Cutting down the number of lags actually did the job, in case anyone else runs into a similar issue. :-)

