  • Help with xtabond2 panel data

    Hi, I would like to begin by saying that I am new to Statalist, so apologies for any issues with the quality of my question. I have previously only used OLS in Stata, but for my thesis I would like to use two-step GMM, following the literature I have read, which advocates using lagged explanatory variables as instruments. I am looking at how board size, independence, and the financial expertise of board members influence performance in the banking industry. I have never used GMM, so while I have spent plenty of time researching, I am unsure whether my methods or syntax are correct.
    I am using an interaction variable, BSFEper (the percentage of financial experts on the board * board size), and I also have a variable for the percentage of independent directors who are classified as financial experts (perIDFE). Furthermore, I am looking at a non-linear relation with board size, so I am quite confused about what exactly should be used as IVs.

    So my main questions are:
    1) Is it okay to have the interactions and non-linear effects, and which should be used as instruments?
    2) What are the issues with my code?
    3) Do the reported tests support the instruments used? (From my understanding they do, but I am unsure.)
    4) What recommendations would you make to improve the analysis?

    Pointers to any other issues would also be greatly appreciated. However, this is a bachelor's thesis, so while I would like my work to be as good as possible, it does not need to be of expert quality.

    Thank you in advance, and once again, apologies if I have not properly followed the posting guidelines.

    Code:
    xi: xtabond2 ROA L.ROA BS BS2 BSFEper BS2FE FEper IDper perIDFE age lnTA year crisis leverage , gmm( L.BS L.BS2 L.IDper L.BSFEper L.perIDFE) iv(i.year, equation(level)) nodiffsargan twostep robust orthogonal small
    Code:
    Dynamic panel-data estimation, two-step system GMM
    ------------------------------------------------------------------------------
    Group variable: gvkey                           Number of obs      =      1516
    Time variable : year                            Number of groups   =       255
    Number of instruments = 848                     Obs per group: min =         1
    F(13, 254)    =   1179.32                                      avg =      5.95
    Prob > F      =     0.000                                      max =        18
    ------------------------------------------------------------------------------
                 |              Corrected
             ROA |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             ROA |
             L1. |   .6742928   .0634229    10.63   0.000     .5493911    .7991945
                 |
              BS |   .0034152   .0020192     1.69   0.092    -.0005613    .0073917
             BS2 |  -.0001301   .0000772    -1.68   0.093    -.0002822     .000022
         BSFEper |  -.0054987   .0032227    -1.71   0.089    -.0118454     .000848
           BS2FE |   .0001996   .0001237     1.61   0.108     -.000044    .0004432
           FEper |   .0354722   .0195382     1.82   0.071    -.0030052    .0739496
           IDper |   -.004322   .0038887    -1.11   0.267    -.0119802    .0033362
         perIDFE |  -.0008125   .0047536    -0.17   0.864     -.010174     .008549
            lnTA |   .0001546   .0002384     0.65   0.517    -.0003149     .000624
            year |  -.0002598   .0000564    -4.60   0.000    -.0003709   -.0001486
             age |   -.000121   .0000856    -1.41   0.159    -.0002897    .0000477
          crisis |  -.0084961   .0013747    -6.18   0.000    -.0112034   -.0057887
        leverage |   .0409919   .0158303     2.59   0.010     .0098164    .0721673
           _cons |   .5139144   .1133947     4.53   0.000     .2906009    .7372279
    ------------------------------------------------------------------------------
    Instruments for orthogonal deviations equation
      GMM-type (missing=0, separate instruments for each period unless collapsed)
        L(1/20).(L.BS L.BS2 L.IDper L.BSFEper L.perIDFE)
    Instruments for levels equation
      Standard
        _Iyear_2003 _Iyear_2004 _Iyear_2005 _Iyear_2006 _Iyear_2007 _Iyear_2008
        _Iyear_2009 _Iyear_2010 _Iyear_2011 _Iyear_2012 _Iyear_2013 _Iyear_2014
        _Iyear_2015 _Iyear_2016 _Iyear_2017 _Iyear_2018 _Iyear_2019 _Iyear_2020
        _Iyear_2021 _Iyear_2022
        _cons
      GMM-type (missing=0, separate instruments for each period unless collapsed)
        D.(L.BS L.BS2 L.IDper L.BSFEper L.perIDFE)
    ------------------------------------------------------------------------------
    Arellano-Bond test for AR(1) in first differences: z =  -4.02  Pr > z =  0.000
    Arellano-Bond test for AR(2) in first differences: z =  -0.25  Pr > z =  0.805
    ------------------------------------------------------------------------------
    Sargan test of overid. restrictions: chi2(834)  = 697.56  Prob > chi2 =  1.000
      (Not robust, but not weakened by many instruments.)
    Hansen test of overid. restrictions: chi2(834)  = 204.94  Prob > chi2 =  1.000
      (Robust, but weakened by many instruments.)

  • #2
    You can treat the interaction/nonlinear terms the same way as any other variable when instrumenting them.

    There are way too many instruments, which leads to the so-called "too-many-instruments problem". As a consequence, coefficients and standard errors are biased, as are test statistics. For example, the p-value of 1.000 for the Hansen test seemingly suggests that everything is okay, but it actually is just a consequence of overfitting (too many instruments). Use the collapse option and possibly restrict the number of lags used with the lag() suboption of gmm().

    You have used a linear time trend, year, as a regressor but year-specific time dummies, i.year, as instruments. It might be better to be consistent (use either a trend or time dummies both as regressors and instruments).

    There are no explicit instruments for the variables age lnTA crisis leverage. You might want to treat them as exogenous variables and instrument them accordingly to avoid a weak-instruments problem.

    Normally, you would also include lagged instruments for the lagged dependent variable L.ROA in gmm().

    The following presentation slides provide more insights about these and other issues:
    https://www.kripfganz.de/stata/
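
    A minimal sketch of how these suggestions might be combined in one command. The lag range lag(1 3) and the choice to list age, lnTA, crisis, and leverage in iv() are illustrative assumptions, not settings confirmed for this data set. The /// line continuations require running it from a do-file.

    Code:
    * Illustrative only: collapsed instruments, a restricted lag range,
    * instruments for L.ROA, and the trend "year" used as regressor and instrument
    xtabond2 ROA L.ROA BS BS2 BSFEper BS2FE FEper IDper perIDFE age lnTA year crisis leverage, ///
        gmm(L.ROA, lag(1 3) collapse) ///
        gmm(L.(BS BS2 IDper BSFEper perIDFE), lag(1 3) collapse) ///
        iv(age lnTA crisis leverage) ///
        iv(year, equation(level)) ///
        twostep robust orthogonal small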


    • #3
      Thank you very much! That is very helpful. I understand it may be situational, but is there any general rule of thumb regarding the maximum number of instruments relative to the number of observations or groups?
      In terms of reducing instruments, is there a need to include L.BS2 (lagged board size^2), or would L.BS serve as a valid instrument for both BS and BS2?
      I have the same question for the interaction terms: would including just the individual terms be sufficient?

      Thank you again


      • #4
        Any rule of thumb is somewhat arbitrary, but you can find one on slide 93 of my presentation. In any case, the number of instruments should be much smaller than the number of groups.

        Lagged board size could be a sufficiently strong instrument for board size squared; but this is not guaranteed. It really depends on your data. If you manage to reduce the number of instruments by other means, it might be safer to include the lagged square in the set of instruments. The same argument applies to the interaction terms.
        https://www.kripfganz.de/stata/
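
        To compare the two numbers directly after estimation, something like the following could be run. It assumes that xtabond2 leaves the instrument count in e(j) and the number of groups in e(N_g); ereturn list shows which results are actually stored.

        Code:
        * Run immediately after xtabond2
        ereturn list
        display "instruments = " e(j) ", groups = " e(N_g)
        display "instruments per group = " e(j)/e(N_g)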


        • #5
          Thank you once again! I have now adjusted my model and code. I know the number of instruments is still very high, but I read that the minimum acceptable standard is that it is less than the number of groups, and as this is for a bachelor thesis, I think that should be acceptable.

          How does it look now? Also, is it just the Hansen test that I should be looking at, as it is a system GMM estimator?



          Code:
          xi: xtabond2 ROA L.ROA BS BS2 BSFEper BS2FE FEper IDper perIDFE IDFEINTT lnTA year age  RiskAdjustedCapitalRatioTi , gmm( L.BS L.IDper L.lnTA L.ROA  L.FEper L.RiskAdjustedCapitalRatioTi , lag(1 0)) iv(year, equation(level)) nodiffsargan twostep robust orthogonal small
          Code:
          Dynamic panel-data estimation, two-step system GMM
          ------------------------------------------------------------------------------
          Group variable: gvkey                           Number of obs      =      2124
          Time variable : year                            Number of groups   =       308
          Number of instruments = 306                     Obs per group: min =         1
          F(13, 307)    =    463.44                                      avg =      6.90
          Prob > F      =     0.000                                      max =        18
          --------------------------------------------------------------------------------------------
                                     |              Corrected
                                 ROA |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
          ---------------------------+----------------------------------------------------------------
                                 ROA |
                                 L1. |   .5913544   .0510288    11.59   0.000      .490944    .6917648
                                     |
                                  BS |   .0053686   .0026697     2.01   0.045     .0001153     .010622
                                 BS2 |   -.000215   .0001037    -2.07   0.039    -.0004191   -.0000109
                             BSFEper |  -.0122738   .0053504    -2.29   0.022    -.0228019   -.0017457
                               BS2FE |   .0005071   .0002167     2.34   0.020     .0000807    .0009336
                               FEper |   .0413963   .0329856     1.25   0.210    -.0235101    .1063028
                               IDper |   -.048096   .0121756    -3.95   0.000    -.0720542   -.0241378
                             perIDFE |  -.0464044   .0187034    -2.48   0.014    -.0832075   -.0096013
                            IDFEINTT |   .0917104   .0242439     3.78   0.000     .0440052    .1394157
                                lnTA |  -.0001426   .0004237    -0.34   0.737    -.0009763     .000691
                                year |   .0000578   .0000599     0.97   0.335      -.00006    .0001756
                                 age |  -.0001404   .0001698    -0.83   0.409    -.0004745    .0001938
          RiskAdjustedCapitalRatioTi |   .0001449   .0001569     0.92   0.356    -.0001638    .0004536
                               _cons |  -.0921911    .117807    -0.78   0.434    -.3240025    .1396202
          --------------------------------------------------------------------------------------------
          Instruments for orthogonal deviations equation
            GMM-type (missing=0, separate instruments for each period unless collapsed)
              L(0/1).(L.BS L.IDper L.lnTA L.ROA L.FEper L.RiskAdjustedCapitalRatioTi)
          Instruments for levels equation
            Standard
              year
              _cons
            GMM-type (missing=0, separate instruments for each period unless collapsed)
              DL.(L.BS L.IDper L.lnTA L.ROA L.FEper L.RiskAdjustedCapitalRatioTi)
          ------------------------------------------------------------------------------
          Arellano-Bond test for AR(1) in first differences: z =  -4.69  Pr > z =  0.000
          Arellano-Bond test for AR(2) in first differences: z =  -1.36  Pr > z =  0.175
          ------------------------------------------------------------------------------
          Sargan test of overid. restrictions: chi2(292)  =1283.18  Prob > chi2 =  0.000
            (Not robust, but not weakened by many instruments.)
          Hansen test of overid. restrictions: chi2(292)  = 283.35  Prob > chi2 =  0.631
            (Robust, but weakened by many instruments.)
          Thank you once again. Your help is very much appreciated!


          • #6
            Personally, I think the number of instruments is still way too high. But eventually you would need to come to an agreement with your thesis supervisor on this matter.

            You should definitely report the Hansen test and the Arellano-Bond AR(2) test. Normally, I would also recommend considering a difference-in-Hansen test, which evaluates the system GMM estimator relative to the difference GMM estimator, and possibly underidentification tests. Details can be found in my presentation slides, which I linked earlier. Whether you need those for your bachelor thesis is again something you should discuss with your supervisor, if in doubt.
            https://www.kripfganz.de/stata/
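
            For reporting, the two p-values can also be pulled from the stored results. This sketch assumes xtabond2 saves them as e(hansenp) and e(ar2p), which ereturn list can confirm. The difference-in-Hansen tests are printed by default and are only suppressed when nodiffsargan is specified.

            Code:
            * Run immediately after xtabond2
            display "Hansen test p-value: " %5.3f e(hansenp)
            display "AR(2) test p-value:  " %5.3f e(ar2p)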


            • #7
              Thank you again. You have been very helpful. I have now used the collapse option, and the results have turned out much better, I think. Including the squared term and the interaction terms led to overidentification, but I think the individual variables should serve as strong enough instruments. I have not included lnTA as an instrument, following the previous literature I have read on related studies. Are there any other issues you can see here? And am I correct in interpreting the results as evidence that the instruments are strong and valid?





              Code:
              xi: xtabond2 ROA L.ROA BS BS2 BSFEper BS2FE FEper IDper perIDFE IDFEINTT lnTA year leverage RiskAdjustedCapitalRatioTi crisis  , gmm( L.(ROA BS FEper IDper perIDFE leverage RiskAdjustedCapitalRatioTi ) , collapse lag(2 0)) iv(year, equation(level)) twostep orthogonal small robust

              Code:
              Dynamic panel-data estimation, two-step system GMM
              ------------------------------------------------------------------------------
              Group variable: gvkey                           Number of obs      =      1497
              Time variable : year                            Number of groups   =       253
              Number of instruments = 30                      Obs per group: min =         1
              F(14, 252)    =     24.96                                      avg =      5.92
              Prob > F      =     0.000                                      max =        18
              ------------------------------------------------------------------------------
                           |              Corrected
                       ROA |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
              -------------+----------------------------------------------------------------
                       ROA |
                       L1. |   .3995683   .1771008     2.26   0.025     .0507821    .7483546
                           |
                        BS |   .1314227   .0407605     3.22   0.001      .051148    .2116974
                       BS2 |  -.0051375   .0016141    -3.18   0.002    -.0083164   -.0019587
                   BSFEper |  -.2927911   .0858397    -3.41   0.001    -.4618457   -.1237366
                     BS2FE |   .0115395   .0034741     3.32   0.001     .0046976    .0183813
                     FEper |   1.764199   .5179807     3.41   0.001     .7440759    2.784322
                     IDper |  -.5324772   .2107591    -2.53   0.012    -.9475509   -.1174036
                   perIDFE |  -.9383621    .409526    -2.29   0.023    -1.744892   -.1318324
                  IDFEINTT |   1.130017   .5078143     2.23   0.027     .1299163    2.130118
                      lnTA |   .0020307   .0032752     0.62   0.536    -.0044195    .0084809
                      year |  -.0007071   .0003499    -2.02   0.044    -.0013963    -.000018
                  leverage |  -.1072231   .1108108    -0.97   0.334    -.3254564    .1110103
              RiskAdjust~i |  -.0005055   .0008141    -0.62   0.535    -.0021087    .0010978
                    crisis |   -.017958   .0078617    -2.28   0.023    -.0334411    -.002475
                     _cons |   1.078268   .7077728     1.52   0.129    -.3156353    2.472172
              ------------------------------------------------------------------------------
              Instruments for orthogonal deviations equation
                GMM-type (missing=0, separate instruments for each period unless collapsed)
                  L(0/2).(L.ROA L.BS L.FEper L.IDper L.perIDFE L.leverage
                  L.RiskAdjustedCapitalRatioTi) collapsed
              Instruments for levels equation
                Standard
                  year
                  _cons
                GMM-type (missing=0, separate instruments for each period unless collapsed)
                  DL.(L.ROA L.BS L.FEper L.IDper L.perIDFE L.leverage
                  L.RiskAdjustedCapitalRatioTi) collapsed
              ------------------------------------------------------------------------------
              Arellano-Bond test for AR(1) in first differences: z =  -2.79  Pr > z =  0.005
              Arellano-Bond test for AR(2) in first differences: z =  -0.99  Pr > z =  0.324
              ------------------------------------------------------------------------------
              Sargan test of overid. restrictions: chi2(15)   =  19.10  Prob > chi2 =  0.209
                (Not robust, but not weakened by many instruments.)
              Hansen test of overid. restrictions: chi2(15)   =  14.20  Prob > chi2 =  0.511
                (Robust, but weakened by many instruments.)
              
              Difference-in-Hansen tests of exogeneity of instrument subsets:
                GMM instruments for levels
                  Hansen test excluding group:     chi2(8)    =   8.79  Prob > chi2 =  0.360
                  Difference (null H = exogenous): chi2(7)    =   5.41  Prob > chi2 =  0.611
                iv(year, eq(level))
                  Hansen test excluding group:     chi2(14)   =  12.91  Prob > chi2 =  0.533
                  Difference (null H = exogenous): chi2(1)    =   1.28  Prob > chi2 =  0.257
              Once again, your help is greatly appreciated. Thank you!


              • #8
                xtabond2 does not provide tests for instrument strength. The Hansen test in your case does not provide evidence of instrument invalidity. The Arellano-Bond AR(2) test also supports the correct model specification. The output given does not provide reason for concern.
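
                As a cross-check, the instrument count and the Hansen degrees of freedom in the output above can be reproduced by hand, assuming each collapsed gmm() variable contributes one column per lag to the transformed equation and one column to the levels equation:

                Code:
                * 7 gmm() variables x 3 lags (0/2), plus 7 for the levels equation, plus year and _cons
                display 7*3 + 7 + 2
                * instruments minus 15 estimated coefficients (14 regressors and the constant)
                display (7*3 + 7 + 2) - 15

                The results, 30 and 15, match the reported number of instruments and the chi2(15) of the Hansen test.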

                However, notice that your model is necessarily misspecified by your use of the lag(2 0) suboption. The second value should normally be larger than the first value, although it seems that xtabond2 is flipping the values automatically. The list of instruments underneath the regression output suggests that L0.L.ROA = L.ROA was used as an instrument. This lagged dependent variable is actually invalid as an instrument by construction of the model. You would need to start at least with the second lag of the dependent variable (i.e., the first lag of the lagged dependent variable). You can achieve this, for example, by using the suboption lag(1 3) instead.
                https://www.kripfganz.de/stata/
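
                Applied to the command from #7, the suggested change might look as follows. Only the lag() suboption differs from the original, and lag(1 3) is just the example range mentioned above; the /// line continuations assume a do-file.

                Code:
                * Same specification as in #7, but the instruments now start at the first lag of L.ROA
                xi: xtabond2 ROA L.ROA BS BS2 BSFEper BS2FE FEper IDper perIDFE IDFEINTT lnTA year leverage RiskAdjustedCapitalRatioTi crisis, ///
                    gmm(L.(ROA BS FEper IDper perIDFE leverage RiskAdjustedCapitalRatioTi), collapse lag(1 3)) ///
                    iv(year, equation(level)) twostep orthogonal small robust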


                • #9
                  Okay, great. Thank you very much for all of your help, Sebastian.
