
  • xtabond2

    Dear Statalist members,

    I have been trying to implement xtabond2 in my undergraduate thesis. I have decided that the first difference of a democracy variable (polity) may not be a good instrument, since polity changes slowly over time (i.e. many differences will be 0). I would therefore like to check that I have told Stata to use just the lagged levels of this variable, rather than the differences as well.

    In my model, predetermined variables are: gdp0, inflation, govsize. Endogenous variables are: mgdp, polity, aid, aidpol.

    Also, is the warning Stata is providing something to be concerned over?

    Code:
    xtabond2 growth aid aidpol gdp0 polity inflation govsize mgdp i.year, gmm(gdp0 inflation govsize, lag (1 .)collapse) gmm(aid mgdp, lag(2 .)collapse) gmm(aidpol polity, lag(2 .) eq(level)) iv
    > (i.year, eq(level)) robust
    Favoring space over speed. To switch, type or click on mata: mata set matafavor speed, perm.
    Warning: Two-step estimated covariance matrix of moments is singular.
      Using a generalized inverse to calculate robust weighting matrix for Hansen test.
      Difference-in-Sargan/Hansen statistics may be negative.
    
    Dynamic panel-data estimation, one-step system GMM
    ------------------------------------------------------------------------------
    Group variable: countrynum                      Number of obs      =       348
    Time variable : year                            Number of groups   =        58
    Number of instruments = 46                      Obs per group: min =         6
    Wald chi2(13) =    258.70                                      avg =      6.00
    Prob > chi2   =     0.000                                      max =         6
    ------------------------------------------------------------------------------
                 |               Robust
          growth |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             aid |  -.0574378   .1154779    -0.50   0.619    -.2837704    .1688947
          aidpol |    .043429   .0184724     2.35   0.019     .0072237    .0796343
            gdp0 |   1.976001   .8199689     2.41   0.016     .3688917    3.583111
          polity |  -.1539181   .2680458    -0.57   0.566    -.6792782     .371442
       inflation |   -.007488   .0097597    -0.77   0.443    -.0266166    .0116406
         govsize |  -.0630577   .1085896    -0.58   0.561    -.2758895    .1497741
            mgdp |   .1684272   .0204182     8.25   0.000     .1284083    .2084461
                 |
            year |
              1  |          0  (empty)
              2  |  -.5705357   .6530268    -0.87   0.382    -1.850445    .7093733
              3  |  -.9162101   .5882048    -1.56   0.119     -2.06907    .2366501
              4  |   .3565344   .5364327     0.66   0.506    -.6948544    1.407923
              5  |  -2.538261   .8779921    -2.89   0.004    -4.259094   -.8174283
              6  |  -3.248694   1.059512    -3.07   0.002      -5.3253   -1.172088
                 |
           _cons |  -15.53429   5.980947    -2.60   0.009    -27.25674   -3.811853
    ------------------------------------------------------------------------------
    Instruments for first differences equation
      GMM-type (missing=0, separate instruments for each period unless collapsed)
        L(2/5).(aid mgdp) collapsed
        L(1/5).(gdp0 inflation govsize) collapsed
    Instruments for levels equation
      Standard
        1b.year 2.year 3.year 4.year 5.year 6.year
        _cons
      GMM-type (missing=0, separate instruments for each period unless collapsed)
        DL(2/4).(aidpol polity)
        DL.(aid mgdp) collapsed
        D.(gdp0 inflation govsize) collapsed
    ------------------------------------------------------------------------------
    Arellano-Bond test for AR(1) in first differences: z =  -2.91  Pr > z =  0.004
    Arellano-Bond test for AR(2) in first differences: z =   0.61  Pr > z =  0.539
    ------------------------------------------------------------------------------
    Sargan test of overid. restrictions: chi2(32)   = 102.89  Prob > chi2 =  0.000
      (Not robust, but not weakened by many instruments.)
    Hansen test of overid. restrictions: chi2(32)   =  41.97  Prob > chi2 =  0.112
      (Robust, but weakened by many instruments.)
    
    Difference-in-Hansen tests of exogeneity of instrument subsets:
      GMM instruments for levels
        Hansen test excluding group:     chi2(15)   =  20.88  Prob > chi2 =  0.141
        Difference (null H = exogenous): chi2(17)   =  21.09  Prob > chi2 =  0.222
      gmm(gdp0 inflation govsize, collapse lag(1 .))
        Hansen test excluding group:     chi2(14)   =  22.00  Prob > chi2 =  0.079
        Difference (null H = exogenous): chi2(18)   =  19.97  Prob > chi2 =  0.334
      gmm(aid mgdp, collapse lag(2 .))
        Hansen test excluding group:     chi2(22)   =  38.83  Prob > chi2 =  0.015
        Difference (null H = exogenous): chi2(10)   =   3.14  Prob > chi2 =  0.978
      gmm(aidpol polity, eq(level) lag(2 .))
        Hansen test excluding group:     chi2(20)   =  30.11  Prob > chi2 =  0.068
        Difference (null H = exogenous): chi2(12)   =  11.86  Prob > chi2 =  0.457
      iv(1b.year 2.year 3.year 4.year 5.year 6.year, eq(level))
        Hansen test excluding group:     chi2(27)   =  38.58  Prob > chi2 =  0.069
        Difference (null H = exogenous): chi2(5)    =   3.39  Prob > chi2 =  0.639
    Many thanks in advance,

    Simon

  • #2
    A few remarks:
    1. Your number of instruments is (too) large relative to the number of groups / observations. This is the likely source of the warning message. (After all, it is still just a warning and not an error message, but it hints at some potential problems with your model.) I recommend further restricting the maximum number of lags used as instruments and rethinking whether you really need to assume that 4 of your variables are endogenous. (The more endogenous variables you have, the more difficult it becomes to obtain reliable estimates, because you are relying on the strength of your instruments, which might quickly turn weak the further you lag them in time.)
    2. As you can see below your regression table, polity enters the set of instruments only in first-differenced form for the level equation. If you want levels instead, xtabond2 has a passthru suboption for gmm(). Alternatively, you can directly use the iv() option. In any case, keep in mind that using levels requires the assumption that polity is uncorrelated with the error term, including the unit-specific effects.
    3. There is a severe bug in xtabond2 that causes the degrees of freedom for the Sargan / Hansen tests to be calculated incorrectly when you specify time dummies with the factor notation. Consequently, the p-values will be incorrect. As a workaround, create separate dummy variables for each time period first and then specify only those that do not get omitted. (Whenever xtabond2 reports coefficients to be "omitted" or "empty", this bug kicks in.) Alternatively, you can use my xtseqreg command instead, which has a teffects option for the correct inclusion of time effects.
    4. I am worried a bit about your coefficient of gdp0. Assuming that the dependent variable is GDP growth and this is the log of lagged GDP, this coefficient should typically fall in the region between -1 and 0. Did you maybe forget to take the log of this variable?
    https://twitter.com/Kripfganz
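
    A minimal sketch of points 1 and 3 above, not a definitive specification. It assumes that year is coded 1 to 6 (as the output suggests), so that tabulate creates time1 to time6, and that none of time2 to time6 gets omitted. The shortened lag ranges lag(1 2) and lag(2 3) are arbitrary illustrative choices; how far to restrict the lags depends on the data.

    ```stata
    * Run from a do-file: /// continues a command on the next line.

    * Point 3: build explicit period dummies instead of using i.year.
    * Assumes year is coded 1..6, so this creates time1, ..., time6.
    tabulate year, generate(time)

    * Point 1: reduce the instrument count with shorter lag ranges.
    * lag(1 2) and lag(2 3) are illustrative, not recommendations.
    xtabond2 growth aid aidpol gdp0 polity inflation govsize mgdp ///
        time2 time3 time4 time5 time6, ///
        gmm(gdp0 inflation govsize, lag(1 2) collapse) ///
        gmm(aid mgdp, lag(2 3) collapse) ///
        gmm(aidpol polity, eq(level) passthru lag(2 3)) ///
        iv(time2 time3 time4 time5 time6, eq(level)) robust
    ```

    The instrument count reported in the output header can then be compared against the number of groups (58 here).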


    • #3
      Hi Sebastian,

      Thank you for your useful comments. In regards to (2) and (3), have I successfully solved this?
      Code:
       xtabond2 growth aid aidpol gdp0 polity inflation govsize mgdp time2 time3 time4 time5 time6, gmm(gdp0 inflation govsize, lag (1 .)collapse) gmm(aid mgdp, lag(2 .)collapse) gmm(aidpol polity,
      > eq(level) p lag(2 .)) iv(time2 time3 time4 time5 time6, eq(level)) robust
      Favoring space over speed. To switch, type or click on mata: mata set matafavor speed, perm.
      
      Dynamic panel-data estimation, one-step system GMM
      ------------------------------------------------------------------------------
      Group variable: countrynum                      Number of obs      =       348
      Time variable : year                            Number of groups   =        58
      Number of instruments = 54                      Obs per group: min =         6
      Wald chi2(12) =    302.65                                      avg =      6.00
      Prob > chi2   =     0.000                                      max =         6
      ------------------------------------------------------------------------------
                   |               Robust
            growth |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
               aid |  -.0731343   .1269097    -0.58   0.564    -.3218728    .1756042
            aidpol |    .032537   .0150912     2.16   0.031     .0029587    .0621153
              gdp0 |   1.664245   1.043606     1.59   0.111    -.3811862    3.709676
            polity |  -.2180305   .1440524    -1.51   0.130    -.5003681    .0643071
         inflation |  -.0071849   .0098774    -0.73   0.467    -.0265443    .0121745
           govsize |  -.0532686   .0953484    -0.56   0.576    -.2401481    .1336108
              mgdp |   .1647532   .0149169    11.04   0.000     .1355166    .1939899
             time2 |  -.2851089   .6310224    -0.45   0.651     -1.52189    .9516724
             time3 |  -.5975558   .6009205    -0.99   0.320    -1.775338    .5802268
             time4 |   .8827259   .5615615     1.57   0.116    -.2179144    1.983366
             time5 |  -1.847135   .8141298    -2.27   0.023      -3.4428   -.2514702
             time6 |   -2.53298   .9308799    -2.72   0.007    -4.357472   -.7084893
             _cons |  -13.33185   7.925602    -1.68   0.093    -28.86574    2.202045
      ------------------------------------------------------------------------------
      Instruments for first differences equation
        GMM-type (missing=0, separate instruments for each period unless collapsed)
          L(2/5).(aid mgdp) collapsed
          L(1/5).(gdp0 inflation govsize) collapsed
      Instruments for levels equation
        Standard
          time2 time3 time4 time5 time6
          _cons
        GMM-type (missing=0, separate instruments for each period unless collapsed)
          L(2/5).(aidpol polity)
          DL.(aid mgdp) collapsed
          D.(gdp0 inflation govsize) collapsed
      ------------------------------------------------------------------------------
      Arellano-Bond test for AR(1) in first differences: z =  -3.21  Pr > z =  0.001
      Arellano-Bond test for AR(2) in first differences: z =   0.75  Pr > z =  0.451
      ------------------------------------------------------------------------------
      Sargan test of overid. restrictions: chi2(41)   = 120.38  Prob > chi2 =  0.000
        (Not robust, but not weakened by many instruments.)
      Hansen test of overid. restrictions: chi2(41)   =  44.20  Prob > chi2 =  0.338
        (Robust, but weakened by many instruments.)
      
      Difference-in-Hansen tests of exogeneity of instrument subsets:
        GMM instruments for levels
          Hansen test excluding group:     chi2(16)   =  20.09  Prob > chi2 =  0.216
          Difference (null H = exogenous): chi2(25)   =  24.10  Prob > chi2 =  0.513
        gmm(gdp0 inflation govsize, collapse lag(1 .))
          Hansen test excluding group:     chi2(23)   =  25.70  Prob > chi2 =  0.315
          Difference (null H = exogenous): chi2(18)   =  18.50  Prob > chi2 =  0.423
        gmm(aid mgdp, collapse lag(2 .))
          Hansen test excluding group:     chi2(31)   =  41.85  Prob > chi2 =  0.092
          Difference (null H = exogenous): chi2(10)   =   2.35  Prob > chi2 =  0.993
        gmm(aidpol polity, passthru eq(level) lag(2 .))
          Hansen test excluding group:     chi2(21)   =  29.71  Prob > chi2 =  0.098
          Difference (null H = exogenous): chi2(20)   =  14.49  Prob > chi2 =  0.805
        iv(time2 time3 time4 time5 time6, eq(level))
          Hansen test excluding group:     chi2(36)   =  41.79  Prob > chi2 =  0.234
          Difference (null H = exogenous): chi2(5)    =   2.41  Prob > chi2 =  0.790
      As for the assumption, is this required if I were to use differences as well?

      4) Thank you for highlighting this. I have checked some empirical papers, and those using GMM sometimes get magnitudes greater than 1. However, the sign is worrying. Do you know of any other potential causes? gdp0 is the natural logarithm of GDP per capita at the beginning of each period (for example, if period 1 is 2000-04, then gdp0 in period 1 is ln(GDPpc in 2000)).

      Many thanks,

      Simon


      • #4
        2) This looks all right. If you were using differences as instruments for the level equation (as you are doing for some variables), these differences need to be uncorrelated with the error term (and thus the unobserved country-specific effects). This is essentially an assumption on the initial observations; see Blundell and Bond (1998).
        3) Yes.
        4) Is the dependent variable the growth rate of GDP per capita or just GDP? Instead of directly using the growth rate, you might want to use the log-approximation of the growth rate, log(GDP_t) - log(GDP_{t-1}).
        https://twitter.com/Kripfganz
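
        A minimal sketch of the log approximation in point 4. It assumes a hypothetical variable gdppc holding GDP per capita in levels for each country and period (no such variable appears in this thread) and that the panel is declared with the countrynum and year variables shown in the output.

        ```stata
        * gdppc is a hypothetical variable: GDP per capita in levels.
        xtset countrynum year

        * Log of GDP per capita, then its first difference within country.
        generate double lngdppc = ln(gdppc)
        generate double growth_log = lngdppc - L.lngdppc

        * Compare with the existing growth variable before swapping it in.
        summarize growth growth_log
        ```

        If growth is measured in percent, multiplying growth_log by 100 puts the two on the same scale. With multi-year periods, the log difference spans the whole period rather than a single year.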


        • #5
          Hi Sebastian,

          Thanks again. The dependent variable is the growth rate of GDP per capita, obtained from the World Bank. Do you think a move from growth rate to log-approximation could make such a large difference?

          Many thanks,

          Simon


          • #6
            Depending on your sample of countries and years, there might be some extreme growth rates, even in the range of plus/minus 40%. In that case, yes, it can make a big difference. There would not be much of a difference if all growth rates were in the vicinity of zero.
            https://twitter.com/Kripfganz
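
            To illustrate the size of the gap, a short sketch comparing a few hypothetical growth rates g with their log approximation ln(1 + g). The rates are made up for illustration and are not taken from Simon's data. For g = 0.02 the two nearly coincide (about 0.0198), while for g = 0.40 the log approximation is about 0.336 and for g = -0.40 about -0.511.

            ```stata
            * Log approximation ln(1 + g) for three hypothetical growth rates g.
            display "g =  0.02:  ln(1+g) = " %7.4f ln(1 + 0.02)
            display "g =  0.40:  ln(1+g) = " %7.4f ln(1 + 0.40)
            display "g = -0.40:  ln(1+g) = " %7.4f ln(1 - 0.40)
            ```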


            • #7
              Hi Sebastian,

              Thank you for all your comments and guidance.

              Simon


              • #8
                Hi all,

                Sorry to bump this thread again. If you are doing difference GMM, is it still required to specify eq(level) with the time dummies, e.g. iv(time2 time3 time4 time5 time6, eq(level))?

                Many thanks,

                Simon


                • #9
                  You should specify the equation for the time dummies, yes. The natural way is to specify them for the level equation. If you are using a difference GMM estimator with the noleveleq option, you can specify them instead for the first-difference equation.
                  https://twitter.com/Kripfganz


                  • #10
                    Hi Sebastian,

                    Are you then saying it should be iv(time2 time3...,eq(diff)) if specifying noleveleq?

                    Thanks,

                    Simon


                    • #11
                      Yes. (If you specify noleveleq, then you can actually omit the eq(diff) suboption, but including it does no harm.)
                      https://twitter.com/Kripfganz
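
                      A minimal sketch of the difference-GMM variant discussed in #9 to #11, assuming the dummies time2 to time6 from #3 already exist. The gmm() lists and lag ranges are illustrative assumptions; in particular, the eq(level) instruments from #3 are left out because noleveleq drops the level equation.

                      ```stata
                      * Run from a do-file: /// continues a command on the next line.

                      * Difference GMM: noleveleq drops the level equation, so the
                      * time dummies instrument the first-difference equation.
                      * eq(diff) is redundant here (see #11) but harmless.
                      xtabond2 growth aid aidpol gdp0 polity inflation govsize mgdp ///
                          time2 time3 time4 time5 time6, ///
                          gmm(gdp0 inflation govsize, lag(1 .) collapse) ///
                          gmm(aid aidpol polity mgdp, lag(2 .) collapse) ///
                          iv(time2 time3 time4 time5 time6, eq(diff)) ///
                          noleveleq robust
                      ```

                      If xtabond2 reports any dummy as omitted in this setup, the advice in #2 applies: respecify with only the dummies that are not omitted.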


                      • #12
                        Hi Sebastian,

                        Thank you again!

                        Simon


                        • #13
                          Dear All,

                          I have just registered on this forum. I am running two-step GMM for my PhD thesis. I have 31 countries in my sample. However, the number of groups reported after estimation is less than 31. I would be glad if you could help me.

                          Thank you.
