  • XTABOND2 number of instruments

    I am struggling with GMM and xtabond2. Authors of academic papers rarely report how many instruments their estimated models use. As I understand it, the number of instruments cannot be greater than the number of groups. I have read the material by Roodman (2009) and by Sebastian Kripfganz warning about the number of instruments, but I am still unsure and have many doubts about my output. My data have N=38 and T=30, with 1 dependent and 7 independent variables. I am using 4 exogenous variables as instruments. My command is:

    xtabond2 gdp L.gdp fdi gfcf iq hc fd ele inf, gmm(L.gdp fdi gfcf inf, laglimits(2 2) eq(level) collapse) gmm(L.gdp fdi gfcf inf, laglimits(0 0) eq(diff) collapse) iv(ele hc iq fd, eq(level)) twostep robust nodiffsargan


    Dynamic panel-data estimation, two-step system GMM
    ------------------------------------------------------------------------------
    Group variable: egcode                          Number of obs      =      1102
    Time variable : year                            Number of groups   =        38
    Number of instruments = 13                      Obs per group: min =        29
    Wald chi2(8)  =   2416.01                                      avg =     29.00
    Prob > chi2   =     0.000                                      max =        29
    ------------------------------------------------------------------------------
                 |              Corrected
           gdppc |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
           gdppc |
             L1. |  -.9990342   .0438995   -22.76   0.000    -1.085076   -.9129927
                 |
             fdi |   .2112965   .2501171     0.84   0.398    -.2789239     .701517
            gfcf |   .0605189   .2211572     0.27   0.784    -.3729413    .4939791
              iq |   3.318989   1.314757     2.52   0.012     .7421124    5.895865
              hc |    2.26246   .7812645     2.90   0.004     .7312095     3.79371
              fd |   2.798065   1.395826     2.00   0.045     .0622961    5.533834
             ele |  -1.650007   .4392249    -3.76   0.000    -2.510872   -.7891425
             inf |   -.000655    .000234    -2.80   0.005    -.0011135   -.0001964
           _cons |   22.31125   8.096417     2.76   0.006     6.442562    38.17993
    ------------------------------------------------------------------------------
    Instruments for first differences equation
      GMM-type (missing=0, separate instruments for each period unless collapsed)
        .(L.gdppc fdi gfcf inflation) collapsed
    Instruments for levels equation
      Standard
        lnele lnhc lniq lnfd
        _cons
      GMM-type (missing=0, separate instruments for each period unless collapsed)
        DL2.(L.gdppc fdi gfcf inflation) collapsed
    ------------------------------------------------------------------------------
    Arellano-Bond test for AR(1) in first differences: z =   3.42  Pr > z =  0.001
    Arellano-Bond test for AR(2) in first differences: z =  -4.18  Pr > z =  0.000
    ------------------------------------------------------------------------------
    Sargan test of overid. restrictions: chi2(4)    =  19.21  Prob > chi2 =  0.001
      (Not robust, but not weakened by many instruments.)
    Hansen test of overid. restrictions: chi2(4)    =   6.69  Prob > chi2 =  0.153
      (Robust, but weakened by many instruments.)



    When estimating the model I focus heavily on the number of instruments. Is that the right approach?



    Best regards,

    Gao Yili








  • #2
    Given your relatively small cross-sectional dimension, I believe that it is indeed a good idea to focus a lot on keeping the number of instruments small.
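
    To see how much the collapse suboption matters for the instrument count, here is a minimal sketch on Stata's abdata example dataset. The dataset and the variables n, w, and k are stand-ins chosen for illustration, not the poster's data, and the sketch assumes xtabond2 is installed from SSC. The stored results e(j) and e(N_g) are, to my knowledge, where xtabond2 leaves the number of instruments and groups; ereturn list will confirm this on your installation.

    ```stata
    * Sketch only: abdata stands in for the poster's panel (run from a do-file)
    webuse abdata, clear
    xtset id year

    * Uncollapsed: one instrument column per period and lag for L.n
    xtabond2 n L.n w k, gmm(L.n, laglimits(1 .) eq(diff)) ///
        iv(w k, eq(level)) twostep robust nodiffsargan
    display "without collapse: " e(j) " instruments, " e(N_g) " groups"

    * Collapsed: one instrument column per lag distance
    xtabond2 n L.n w k, gmm(L.n, laglimits(1 .) eq(diff) collapse) ///
        iv(w k, eq(level)) twostep robust nodiffsargan
    display "with collapse:    " e(j) " instruments, " e(N_g) " groups"
    ```

    Restricting laglimits() in addition to collapsing, as in the command above, reduces the count further.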

    In your particular case, notice that gmm(L.gdp, laglimits(0 0) eq(diff)) does not yield a valid instrument. It uses the first lag of gdp (the zeroth lag of L.gdp) as an instrument for the first-differenced model, but only the second lag of gdp (and lags further back) is a valid instrument there. You should thus choose gmm(L.gdp, laglimits(1 1) eq(diff)).

    Conversely, gmm(L.gdp, laglimits(2 2) eq(level)) potentially goes too far back in time. gmm(L.gdp, laglimits(0 0) eq(level)) would be valid as well for the level model (provided that there is no serial correlation in the idiosyncratic errors).

    The coefficient of your lagged dependent variable, -.9990342, is worrying. It should normally fall in the interval from 0 to 1. Perhaps the problem is resolved once you modify your instruments as suggested here.
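
    As a rough sketch of the two lag choices for the lagged dependent variable, again on the abdata example dataset (an assumption for illustration, with n, w, and k standing in for gdp and its regressors, and xtabond2 assumed to be installed from SSC):

    ```stata
    * Sketch only: abdata stands in for the poster's panel (run from a do-file)
    webuse abdata, clear
    xtset id year

    * Differenced equation: first lag of L.n, i.e. the second lag of n
    * Level equation: zeroth lag of L.n, which xtabond2 uses in differences
    xtabond2 n L.n w k, ///
        gmm(L.n, laglimits(1 1) eq(diff) collapse) ///
        gmm(L.n, laglimits(0 0) eq(level) collapse) ///
        iv(w k, eq(level)) twostep robust nodiffsargan
    ```

    The instrument listing below the coefficient table should then show L.(L.n) for the first-differences equation and D.(L.n) for the levels equation.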
    https://www.kripfganz.de/stata/



    • #3
      Hi Sebastian,

      Thanks for your reply and suggestion! I really appreciate it.

      I made the changes according to your suggestion and I got the following output:

      xtabond2 gdppc L.gdppc fdi gfcf iq hc fd ele inf, gmm(L.gdppc fdi gfcf inf, laglimits(2 2) eq(level) collapse) gmm(L.gdppc fdi gfcf inf, laglimits(1 1) eq(diff) collapse) iv(ele hc iq fd, eq(level)) twostep robust nodiffsargan


      Dynamic panel-data estimation, two-step system GMM
      ------------------------------------------------------------------------------
      Group variable: egcode                          Number of obs      =      1102
      Time variable : year                            Number of groups   =        38
      Number of instruments = 13                      Obs per group: min =        29
      Wald chi2(8)  =    148.48                                      avg =     29.00
      Prob > chi2   =     0.000                                      max =        29
      ------------------------------------------------------------------------------
                   |              Corrected
             gdppc |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
             gdppc |
               L1. |   .2025764   .1675159     1.21   0.227    -.1257487    .5309016
                   |
               fdi |   .0436943   .1142995     0.38   0.702    -.1803287    .2677172
              gfcf |   .0795385   .1309798     0.61   0.544    -.1771772    .3362543
                iq |     .65937   .9067753     0.73   0.467    -1.117877    2.436617
                hc |    1.02063   .3800626     2.69   0.007     .2757213    1.765539
                fd |   .9084523   .7697381     1.18   0.238    -.6002066    2.417111
               ele |    -.56887   .2474581    -2.30   0.022    -1.053879   -.0838609
               inf |  -.0023428    .000323    -7.25   0.000    -.0029759   -.0017096
             _cons |   6.655052    4.72962     1.41   0.159    -2.614833    15.92494
      ------------------------------------------------------------------------------
      Instruments for first differences equation
        GMM-type (missing=0, separate instruments for each period unless collapsed)
          L.(L.gdppc fdi gfcf inflation) collapsed
      Instruments for levels equation
        Standard
          lnele lnhc lniq lnfd
          _cons
        GMM-type (missing=0, separate instruments for each period unless collapsed)
          DL2.(L.gdppc fdi gfcf inflation) collapsed
      ------------------------------------------------------------------------------
      Arellano-Bond test for AR(1) in first differences: z =  -3.08  Pr > z =  0.002
      Arellano-Bond test for AR(2) in first differences: z =  -0.23  Pr > z =  0.816
      ------------------------------------------------------------------------------
      Sargan test of overid. restrictions: chi2(4)    =  26.85  Prob > chi2 =  0.000
        (Not robust, but not weakened by many instruments.)
      Hansen test of overid. restrictions: chi2(4)    =  13.21  Prob > chi2 =  0.010
        (Robust, but weakened by many instruments.)


      xtabond2 gdppc L.gdppc fdi gfcf iq hc fd ele inf, gmm(L.gdppc fdi gfcf inf, laglimits(0 0) eq(level) collapse) gmm(L.gdppc fdi gfcf inf, laglimits(1 1) eq(diff) collapse) iv(ele hc iq fd, eq(level)) twostep robust nodiffsargan


      Dynamic panel-data estimation, two-step system GMM
      ------------------------------------------------------------------------------
      Group variable: egcode                          Number of obs      =      1102
      Time variable : year                            Number of groups   =        38
      Number of instruments = 13                      Obs per group: min =        29
      Wald chi2(8)  =     52.60                                      avg =     29.00
      Prob > chi2   =     0.000                                      max =        29
      ------------------------------------------------------------------------------
                   |              Corrected
             gdppc |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
             gdppc |
               L1. |   .1953844   .1214648     1.61   0.108    -.0426822     .433451
                   |
               fdi |   .0349125   .0870876     0.40   0.689    -.1357761    .2056011
              gfcf |   .5608869   .1920085     2.92   0.003     .1845571    .9372166
                iq |  -2.090898   1.966146    -1.06   0.288    -5.944474    1.762678
                hc |   .4086098   .8994869     0.45   0.650    -1.354352    2.171572
                fd |  -.5632829   1.273709    -0.44   0.658    -3.059706    1.933141
               ele |  -.1713473    .282975    -0.61   0.545    -.7259682    .3832736
               inf |  -.0024796    .000533    -4.65   0.000    -.0035243    -.001435
             _cons |  -10.65992   7.720079    -1.38   0.167      -25.791    4.471153
      ------------------------------------------------------------------------------
      Instruments for first differences equation
        GMM-type (missing=0, separate instruments for each period unless collapsed)
          L.(L.gdppc fdi gfcf inflation) collapsed
      Instruments for levels equation
        Standard
          lnele lnhc lniq lnfd
          _cons
        GMM-type (missing=0, separate instruments for each period unless collapsed)
          D.(L.gdppc fdi gfcf inflation) collapsed
      ------------------------------------------------------------------------------
      Arellano-Bond test for AR(1) in first differences: z =  -3.73  Pr > z =  0.000
      Arellano-Bond test for AR(2) in first differences: z =  -0.85  Pr > z =  0.398
      ------------------------------------------------------------------------------
      Sargan test of overid. restrictions: chi2(4)    =  21.32  Prob > chi2 =  0.000
        (Not robust, but not weakened by many instruments.)
      Hansen test of overid. restrictions: chi2(4)    =   5.73  Prob > chi2 =  0.220
        (Robust, but weakened by many instruments.)




      • #4
        Your second specification looks very reasonable and both the AR(2) and the Hansen test support the specification.


        • #5
          Thanks!

          One more question: I am considering adding time dummies, but I do not want to go too far with the number of instruments or variables, given that T=30.

          I tried several approaches, but the output was not good at all.

          Any suggestions?



          • #6
            Your T is relatively large in comparison to N. It is not surprising that adding a separate time dummy for each period yields troublesome results. You could try to add dummies for, say, 5-year periods instead, or a few dummies for crisis years etc. Alternatively, you could add some variables that capture global economic developments (varying over time but constant across countries).
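
            A minimal sketch of how such dummies might be built, assuming the time variable is called year. The names period5, p5_, and y2009 are illustrative choices, and the crisis year is only an example:

            ```stata
            * Sketch: 5-year period dummies instead of one dummy per year
            generate int period5 = 5 * floor(year / 5)   // e.g. 1990-1994 -> 1990
            tabulate period5, generate(p5_)              // creates p5_1, p5_2, ...
            drop p5_1                                    // first period is the base category

            * Dummy for a single crisis year (illustrative choice of year)
            generate byte y2009 = (year == 2009)
            ```

            The new variables would then be added both to the regressor list and to the iv(..., eq(level)) option, for example as p5_* y2009.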


            • #7
              Very helpful! Thanks!



              • #8
                Hi Sebastian.

                After including some time dummies in the specification, I only get better results when I combine two iv() options, one with eq(diff) and one with eq(level). Is this the right approach?


                ivstyle() with eq(level) only

                Code:
                xtabond2 gdp L.gdp gfcf fdi iq hc fd ele inf y1990 y1991 y2001 y2007 y2009, gmm(L.gdp gfcf fdi inf, laglimits(0 0) eq(level) collapse) gmm(L.gdp gfcf fdi inf, eq(diff) laglimits(1 1) collapse) iv(hc iq fd ele y1990 y1991 y2001 y2007 y2009, eq(level)) twostep robust nodiffsargan orthogonal
                Code:
                Dynamic panel-data estimation, two-step system GMM
                ------------------------------------------------------------------------------
                Group variable: egcode                          Number of obs      =      1085
                Time variable : year                            Number of groups   =        38
                Number of instruments = 18                      Obs per group: min =        23
                Wald chi2(13) =    452.65                                      avg =     28.55
                Prob > chi2   =     0.000                                      max =        29
                ------------------------------------------------------------------------------
                             |              Corrected
                         gdp |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
                -------------+----------------------------------------------------------------
                         gdp |
                         L1. |   .0755628   .0719789     1.05   0.294    -.0655133    .2166388
                             |
                        gfcf |   .0734045   .0152234     4.82   0.000     .0435673    .1032417
                         fdi |   .0001025   .0001032     0.99   0.320    -.0000997    .0003047
                          iq |   .0282243   .0156157     1.81   0.071    -.0023819    .0588305
                          hc |  -.1564944   .0893842    -1.75   0.080    -.3316842    .0186954
                          fd |   .0150725   .0096058     1.57   0.117    -.0037544    .0338994
                         ele |   .1405175   .0343007     4.10   0.000     .0732894    .2077457
                         inf |  -.0013352   .0002696    -4.95   0.000    -.0018635   -.0008069
                       y1990 |  -.0413191   .6618428    -0.06   0.950    -1.338507    1.255869
                       y1991 |   .2146626   .5157637     0.42   0.677    -.7962157    1.225541
                       y2001 |  -1.855946   .3512207    -5.28   0.000    -2.544326   -1.167566
                       y2007 |   1.515472   .2610916     5.80   0.000     1.003742    2.027202
                       y2009 |  -2.730804   .5811994    -4.70   0.000    -3.869934   -1.591674
                       _cons |   2.057906    .304801     6.75   0.000     1.460507    2.655305
                ------------------------------------------------------------------------------
                Instruments for orthogonal deviations equation
                  GMM-type (missing=0, separate instruments for each period unless collapsed)
                    L.(L.gdp gfcf fdi inf) collapsed
                Instruments for levels equation
                  Standard
                    hc iq fd ele y1990 y1991 y2001 y2007 y2009
                    _cons
                  GMM-type (missing=0, separate instruments for each period unless collapsed)
                    D.(L.gdp gfcf fdi inf) collapsed
                ------------------------------------------------------------------------------
                Arellano-Bond test for AR(1) in first differences: z =  -3.98  Pr > z =  0.000
                Arellano-Bond test for AR(2) in first differences: z =   0.40  Pr > z =  0.690
                ------------------------------------------------------------------------------
                Sargan test of overid. restrictions: chi2(4)    =   0.66  Prob > chi2 =  0.956
                  (Not robust, but not weakened by many instruments.)
                Hansen test of overid. restrictions: chi2(4)    =   0.84  Prob > chi2 =  0.933
                  (Robust, but weakened by many instruments.)

                Compound ivstyle

                Code:
                xtabond2 gdp L.gdp gfcf fdi iq hc fd ele inf y1990 y1991 y2001 y2007 y2009, gmm(L.gdp gfcf fdi inf, laglimits(0 0) eq(level) collapse) gmm(L.gdp gfcf fdi inf, eq(diff) laglimits(1 1) collapse) iv(hc iq fd ele y1990 y1991 y2001 y2007 y2009, eq(level)) iv(hc iq fd ele y1990 y1991 y2001 y2007 y2009, eq(diff)) twostep robust nodiffsargan orthogonal
                Code:
                Dynamic panel-data estimation, two-step system GMM
                ------------------------------------------------------------------------------
                Group variable: egcode                          Number of obs      =      1085
                Time variable : year                            Number of groups   =        38
                Number of instruments = 25                      Obs per group: min =        23
                Wald chi2(13) =    329.28                                      avg =     28.55
                Prob > chi2   =     0.000                                      max =        29
                ------------------------------------------------------------------------------
                             |              Corrected
                         gdp |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
                -------------+----------------------------------------------------------------
                         gdp |
                         L1. |   .1778847   .1344312     1.32   0.186    -.0855955     .441365
                             |
                        gfcf |   .0640454   .0212203     3.02   0.003     .0224543    .1056365
                         fdi |   .0001699   .0001774     0.96   0.338    -.0001778    .0005177
                          iq |    .030694     .01824     1.68   0.092    -.0050557    .0664437
                          hc |  -.1938678   .0902754    -2.15   0.032    -.3708043   -.0169314
                          fd |    .004453   .0160067     0.28   0.781    -.0269195    .0358256
                         ele |   .1404895   .0381513     3.68   0.000     .0657143    .2152647
                         inf |  -.0012767    .000335    -3.81   0.000    -.0019333     -.00062
                       y1990 |   .5238467   .9235504     0.57   0.571    -1.286279    2.333972
                       y1991 |  -.0357782   .7253139    -0.05   0.961    -1.457367    1.385811
                       y2001 |    -1.9954   .5407573    -3.69   0.000    -3.055265   -.9355348
                       y2007 |   1.312422   .3421616     3.84   0.000     .6417976    1.983046
                       y2009 |   -3.05845   .7919194    -3.86   0.000    -4.610584   -1.506317
                       _cons |   1.976166   .3531009     5.60   0.000     1.284101    2.668231
                ------------------------------------------------------------------------------
                Instruments for orthogonal deviations equation
                  Standard
                    FOD.(hc iq fd ele y1990 y1991 y2001 y2007 y2009)
                  GMM-type (missing=0, separate instruments for each period unless collapsed)
                    L.(L.gdp gfcf fdi inf) collapsed
                Instruments for levels equation
                  Standard
                    hc iq fd ele y1990 y1991 y2001 y2007 y2009
                    _cons
                  GMM-type (missing=0, separate instruments for each period unless collapsed)
                    D.(L.gdp gfcf fdi inf) collapsed
                ------------------------------------------------------------------------------
                Arellano-Bond test for AR(1) in first differences: z =  -3.25  Pr > z =  0.001
                Arellano-Bond test for AR(2) in first differences: z =   0.65  Pr > z =  0.517
                ------------------------------------------------------------------------------
                Sargan test of overid. restrictions: chi2(11)   =  75.33  Prob > chi2 =  0.000
                  (Not robust, but not weakened by many instruments.)
                Hansen test of overid. restrictions: chi2(11)   =  17.45  Prob > chi2 =  0.095
                  (Robust, but weakened by many instruments.)



                • #9
                  Why do you believe the compound specification gives "better" results?

                  To me, your first results look perfectly fine. There is no good justification for adding the time dummies as instruments for the first-differenced model as well.


                  • #10
                    Hi Sebastian, thanks for your reply!

                    Isn't a Hansen test p-value above 0.25 a sign of trouble?

                    Code:
                    Hansen test of overid. restrictions: chi2(4) = 0.84 Prob > chi2 = 0.933



                    • #11
                      Large p-values of the Hansen test would be a sign of trouble if there is a reason to suspect a too-many-instruments problem (which is actually a problem of too many overidentifying restrictions). Here, you just have 4 overidentifying restrictions (the degrees of freedom of the Hansen test). Personally, I would not worry about the large p-value in your example.
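
                      As a quick check of where those degrees of freedom come from, the following sketch redoes the arithmetic with the numbers reported in the two outputs above: 18 and 25 instruments, and 14 estimated coefficients (13 regressors plus the constant). It uses only Stata's built-in display and chi2tail(), so nothing here depends on xtabond2 itself.

                      ```stata
                      * df of the Hansen test = instruments - estimated coefficients (incl. _cons)
                      display 18 - 14       // eq(level) iv() only: 4 overidentifying restrictions
                      display 25 - 14       // compound iv():       11 overidentifying restrictions

                      * p-values from the upper tail of the chi-squared distribution
                      display %5.3f chi2tail(4, 0.84)
                      display %5.3f chi2tail(11, 17.45)
                      ```

                      The two p-values should come out close to the reported 0.933 and 0.095, up to rounding of the test statistics.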


                      • #12
                        It is clear now. Many thanks!



                        • #13
                          Along the same lines as Gao Yili, I'd appreciate some feedback on the following xtabond2 code, given that:
                          • equityassets L.pretaxprofitonassets expensesrevenues offbalance are endogenous
                          • totalassets GDP concentration_3 domcredit are predetermined
                          • shortrate is exogenous
                          Code:
                            xi: xtabond2 NPLsgrossloans L.NPLsgrossloans shortrate equityassets L.pretaxprofitonassets expensesrevenues offbalance totalassets GDP concentration_3 domcredit i.year country_n, gmm(L.NPLsgrossloans equityassets L.pretaxprofitonassets expensesrevenues offbalance, laglimits(0 0) eq(level) collapse) gmm(L.NPLsgrossloans equityassets L.pretaxprofitonassets expensesrevenues offbalance, laglimits(1 1) eq(diff) collapse) iv(shortrate i.year country_n, eq(level)) twostep cluster(entity_n) nodiffsargan
                          I get the following results:

                          Code:
                          i.year            _Iyear_2009-2017    (naturally coded; _Iyear_2009 omitted)
                          Favoring speed over space. To switch, type or click on mata: mata set matafavor space, perm.
                          _Iyear_2011 dropped due to collinearity
                          _Iyear_2016 dropped due to collinearity
                          Warning: Two-step estimated covariance matrix of moments is singular.
                            Using a generalized inverse to calculate optimal weighting matrix for two-step estimation.
                          
                          Dynamic panel-data estimation, two-step system GMM
                          ------------------------------------------------------------------------------
                          Group variable: entity_n                        Number of obs      =     12652
                          Time variable : year                            Number of groups   =      2788
                          Number of instruments = 19                      Obs per group: min =         1
                          Wald chi2(17) =   1866.13                                      avg =      4.54
                          Prob > chi2   =     0.000                                      max =         8
                                                                 (Std. Err. adjusted for clustering on entity_n)
                          --------------------------------------------------------------------------------------
                                               |              Corrected
                                NPLsgrossloans |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
                          ---------------------+----------------------------------------------------------------
                                NPLsgrossloans |
                                           L1. |    .911101   .2967232     3.07   0.002     .3295342    1.492668
                                               |
                                     shortrate |    .162101   1.212265     0.13   0.894    -2.213896    2.538098
                                  equityassets |   .0304096   .2353199     0.13   0.897     -.430809    .4916281
                                               |
                          pretaxprofitonassets |
                                           L1. |  -.6053552   .5478198    -1.11   0.269    -1.679062    .4683519
                                               |
                              expensesrevenues |  -.0003392   .0021364    -0.16   0.874    -.0045264    .0038481
                                    offbalance |    .157122   .1179013     1.33   0.183    -.0739602    .3882043
                                   totalassets |    .000068    .000475     0.14   0.886    -.0008631    .0009991
                                           GDP |  -1.198131    1.95335    -0.61   0.540    -5.026627    2.630364
                               concentration_3 |   54.25937   48.43717     1.12   0.263    -40.67575    149.1945
                                     domcredit |  -.1925165   .2209086    -0.87   0.383    -.6254893    .2404563
                                   _Iyear_2010 |  -.1184613   1.880578    -0.06   0.950    -3.804327    3.567404
                                   _Iyear_2012 |  -2.562553   4.496527    -0.57   0.569    -11.37558    6.250477
                                   _Iyear_2013 |  -1.232831   5.699771    -0.22   0.829    -12.40418    9.938514
                                   _Iyear_2014 |   -.059447   1.889238    -0.03   0.975    -3.762285    3.643391
                                   _Iyear_2015 |   .3941095   1.017727     0.39   0.699    -1.600598    2.388817
                                   _Iyear_2017 |  -.8067855   1.430191    -0.56   0.573    -3.609908    1.996337
                                     country_n |  -.6561961   1.332347    -0.49   0.622    -3.267548    1.955156
                                         _cons |   1.387771   15.62793     0.09   0.929     -29.2424    32.01794
                          --------------------------------------------------------------------------------------
                          Instruments for first differences equation
                            GMM-type (missing=0, separate instruments for each period unless collapsed)
                              L.(L.NPLsgrossloans equityassets L.pretaxprofitonassets expensesrevenues
                              offbalance) collapsed
                          Instruments for levels equation
                            Standard
                              shortrate _Iyear_2010 _Iyear_2011 _Iyear_2012 _Iyear_2013 _Iyear_2014
                              _Iyear_2015 _Iyear_2016 _Iyear_2017 country_n
                              _cons
                            GMM-type (missing=0, separate instruments for each period unless collapsed)
                              D.(L.NPLsgrossloans equityassets L.pretaxprofitonassets expensesrevenues
                              offbalance) collapsed
                          ------------------------------------------------------------------------------
                          Arellano-Bond test for AR(1) in first differences: z =  -1.92  Pr > z =  0.055
                          Arellano-Bond test for AR(2) in first differences: z =   0.41  Pr > z =  0.683
                          ------------------------------------------------------------------------------
                          Sargan test of overid. restrictions: chi2(1)    =   2.70  Prob > chi2 =  0.101
                            (Not robust, but not weakened by many instruments.)
                          Hansen test of overid. restrictions: chi2(1)    =   0.56  Prob > chi2 =  0.454
                            (Robust, but weakened by many instruments.)
                          I'm particularly worried that none of the regressors is significant, except the autoregressive one.

                          Thank you very much.



                          • #14
                            Originally posted by Bruno De Menna
                            I'm particularly worried that none of the regressors is significant, except the autoregressive one.
                            That is a typical issue in dynamic panel models when the coefficient of the lagged dependent variable is close to 1. It is generally difficult to find strong predictors of a variable that is highly persistent, after controlling for the unit-specific effects (which is done implicitly by construction of the GMM estimator).

                            I am afraid there is not really a recommendation I can make to improve the situation.


                            • #15
                              One final question, Sebastian:

                              In your view, what is the rule of thumb for choosing between the 'eq(level)' and 'eq(diff)' options when setting up suitable instruments in a GMM specification (in particular for endogenous and predetermined variables)?

                              Thank you.
