
  • Problem with Hansen test (xtabond2, gmm)

    Dear users,

    I have a question about xtabond2 (Roodman, 2009), specifically about the Hansen test:

    I'm using unbalanced panel data with T=7 and n=114760.

    I'm trying to estimate the following regression:
    xtabond2 tangrowth L.tangrowth L.levratio llevchange L.cashflow maturity1 L.logsales L.logassets tdum4-tdum7, gmm(tangrowth, lag (1 .)) gmm(levratio maturity1, lag (2 .)) iv(logassets cashflow logsales, eq(level)) iv(logassets cashflow logsales, eq(diff)) iv(tdum4-tdum7, eq(level)) robust small twostep nodiff artests(3)

    It produces the following output:
    [Attached image: Image.png]


    Any suggestions or comments on how I can improve the estimation?
    (I also tried increasing the number of lags of the dependent variable, but the model still fails the Hansen test.)

    Thanks in advance!

  • #2
    Your model seems to suffer from serially correlated errors, as indicated by the Arellano-Bond AR(2) and AR(3) tests. It might be that this is a consequence of lagging all of the independent variables, but this is just my speculation. Your instruments become invalid under serial correlation of the errors but using deeper lags is not really an option given your short time horizon.

    Note that even without serially correlated errors the instruments gmm(tangrowth, lag (1 .)) are invalid. The first lag of the dependent variable is correlated with the first-differenced errors by construction. Similarly, the contemporaneous differences are correlated with the level errors by construction. The first valid lag (under absence of serial correlation) would be the second lag of the dependent variable.
    https://twitter.com/Kripfganz
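
    To illustrate the point about the first valid lag, here is a minimal sketch of the command from the first post with the instruments for the dependent variable starting at the second lag. Everything else is left exactly as posted, so it only shows the lag change and is not meant as a recommended final specification. The /// line continuations assume the command is run from a do-file.

    ```stata
    * Sketch: same command as in the first post, except that the GMM-style
    * instruments for tangrowth now start at lag 2 instead of lag 1.
    xtabond2 tangrowth L.tangrowth L.levratio llevchange L.cashflow maturity1 ///
        L.logsales L.logassets tdum4-tdum7, ///
        gmm(tangrowth, lag(2 .)) ///
        gmm(levratio maturity1, lag(2 .)) ///
        iv(logassets cashflow logsales, eq(level)) ///
        iv(logassets cashflow logsales, eq(diff)) ///
        iv(tdum4-tdum7, eq(level)) ///
        robust small twostep nodiff artests(3)
    ```

    As far as I can tell, writing gmm(L.tangrowth, lag(1 .)) should give the same instruments, because the first lag of L.tangrowth is the second lag of tangrowth.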



    • #3
      Dear Sebastian, thank you very much for your suggestions and the explanation on the lag of the dependent variable!
      Indeed, after removing the lags on the independent variables, it appears that I no longer suffer from serially correlated errors:

      Code:
      xtabond2 tangrowth L.tangrowth levratio levratio_2 cashflow cash2 maturity1 logassets tdum4-tdum7, gmm(tangrowth, lag (2 .)) gmm(levratio cashflow maturity1, lag (2 .)) iv(logassets cash2, eq(level)) iv(logassets cash2, eq(diff)) iv(tdum4-tdum7, eq(level)) robust small twostep artests(3)
      Output:
      Code:
      Dynamic panel-data estimation, two-step system GMM
      ------------------------------------------------------------------------------
      Group variable: firm                            Number of obs      =    325531
      Time variable : year                            Number of groups   =    115104
      Number of instruments = 80                      Obs per group: min =         1
      F(11, 115103) =    344.93                                      avg =      2.83
      Prob > F      =     0.000                                      max =         5
      ------------------------------------------------------------------------------
                   |              Corrected
         tangrowth |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
         tangrowth |
               L1. |   .0409069    .003384    12.09   0.000     .0342744    .0475394
                   |
          levratio |    .512727    .068276     7.51   0.000     .3789071     .646547
        levratio_2 |  -.2591215   .0422063    -6.14   0.000    -.3418452   -.1763978
          cashflow |   1.486181   .0652855    22.76   0.000     1.358223     1.61414
             cash2 |   -.257119   .0225267   -11.41   0.000     -.301271   -.2129671
         maturity1 |   .0589028   .0173661     3.39   0.001     .0248654    .0929401
         logassets |  -.0325585   .0023015   -14.15   0.000    -.0370693   -.0280477
             tdum4 |   .0152197   .0028621     5.32   0.000     .0096101    .0208293
             tdum5 |    .047843    .003385    14.13   0.000     .0412085    .0544776
             tdum6 |   .0512466   .0041504    12.35   0.000     .0431119    .0593814
             tdum7 |   .0598833   .0044382    13.49   0.000     .0511846    .0685821
             _cons |   .3012261   .0435903     6.91   0.000     .2157898    .3866624
      ------------------------------------------------------------------------------
      Instruments for first differences equation
        Standard
          D.(logassets cash2)
        GMM-type (missing=0, separate instruments for each period unless collapsed)
          L(2/6).(levratio cashflow maturity1)
          L(2/6).tangrowth
      Instruments for levels equation
        Standard
          tdum4 tdum5 tdum6 tdum7
          logassets cash2
          _cons
        GMM-type (missing=0, separate instruments for each period unless collapsed)
          DL.(levratio cashflow maturity1)
          DL.tangrowth
      ------------------------------------------------------------------------------
      Arellano-Bond test for AR(1) in first differences: z = -56.58  Pr > z =  0.000
      Arellano-Bond test for AR(2) in first differences: z =   1.51  Pr > z =  0.132
      Arellano-Bond test for AR(3) in first differences: z =  -0.48  Pr > z =  0.629
      ------------------------------------------------------------------------------
      Sargan test of overid. restrictions: chi2(68)   =2986.15  Prob > chi2 =  0.000
        (Not robust, but not weakened by many instruments.)
      Hansen test of overid. restrictions: chi2(68)   = 990.26  Prob > chi2 =  0.000
        (Robust, but weakened by many instruments.)
      
      Difference-in-Hansen tests of exogeneity of instrument subsets:
        GMM instruments for levels
          Hansen test excluding group:     chi2(48)   = 246.51  Prob > chi2 =  0.000
          Difference (null H = exogenous): chi2(20)   = 743.75  Prob > chi2 =  0.000
        gmm(tangrowth, lag(2 .))
          Hansen test excluding group:     chi2(56)   = 948.94  Prob > chi2 =  0.000
          Difference (null H = exogenous): chi2(12)   =  41.32  Prob > chi2 =  0.000
        gmm(levratio cashflow maturity1, lag(2 .))
          Hansen test excluding group:     chi2(9)    =  37.71  Prob > chi2 =  0.000
          Difference (null H = exogenous): chi2(59)   = 952.55  Prob > chi2 =  0.000
        iv(logassets cash2, eq(level))
          Hansen test excluding group:     chi2(66)   = 717.85  Prob > chi2 =  0.000
          Difference (null H = exogenous): chi2(2)    = 272.41  Prob > chi2 =  0.000
        iv(logassets cash2, eq(diff))
          Hansen test excluding group:     chi2(66)   = 548.37  Prob > chi2 =  0.000
          Difference (null H = exogenous): chi2(2)    = 441.89  Prob > chi2 =  0.000
        iv(tdum4 tdum5 tdum6 tdum7, eq(level))
          Hansen test excluding group:     chi2(64)   = 785.31  Prob > chi2 =  0.000
          Difference (null H = exogenous): chi2(4)    = 204.95  Prob > chi2 =  0.000
      However, I am now trying to address the rejected exogeneity tests for the instrument subsets. I was reading a previous post of yours (https://www.statalist.org/forums/for...fference/page2) and, given my output, I am inclined to think that my classification of variables as endogenous or exogenous may be incorrect. Is this reasoning right?

      Thanks in advance!
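
      One possible way to narrow down which variables drive the rejections, sketched here as an assumption rather than something proposed in the thread, is to give each potentially endogenous variable its own gmm() group. xtabond2 reports a difference-in-Hansen test for each instrument group, so the instrument set should stay the same while the output shows one test per variable. These difference tests are only informative if the remaining instruments are valid, which the overall Hansen test above calls into question.

      ```stata
      * Sketch: the command from this post with gmm(levratio cashflow maturity1)
      * split into one gmm() group per variable, so that each variable gets
      * its own difference-in-Hansen test in the output.
      xtabond2 tangrowth L.tangrowth levratio levratio_2 cashflow cash2 maturity1 ///
          logassets tdum4-tdum7, ///
          gmm(tangrowth, lag(2 .)) ///
          gmm(levratio, lag(2 .)) ///
          gmm(cashflow, lag(2 .)) ///
          gmm(maturity1, lag(2 .)) ///
          iv(logassets cash2, eq(level)) ///
          iv(logassets cash2, eq(diff)) ///
          iv(tdum4-tdum7, eq(level)) ///
          robust small twostep artests(3)
      ```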



      • #4
        That could be a reason. Generally, it is difficult to say what exactly is causing the problem. It could be a general model misspecification such as omitted variables.

        Note that the instrument specification iv(logassets cash2, eq(level)) assumes that these two variables are uncorrelated with the unobserved "fixed effects" because they enter untransformed in the level equation.
        https://twitter.com/Kripfganz
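
        If that assumption about logassets and cash2 seems too strong, one option is to drop them as instruments for the level equation and keep only their first differences for the differenced equation. The sketch below is an illustration under that assumption, not a specification endorsed in the thread. It still treats both variables as strictly exogenous with respect to the idiosyncratic errors, and the GMM-type instruments for the level equation are left unchanged.

        ```stata
        * Sketch: the command from post #3 without iv(logassets cash2, eq(level)),
        * so logassets and cash2 instrument only the first-differenced equation.
        xtabond2 tangrowth L.tangrowth levratio levratio_2 cashflow cash2 maturity1 ///
            logassets tdum4-tdum7, ///
            gmm(tangrowth, lag(2 .)) ///
            gmm(levratio cashflow maturity1, lag(2 .)) ///
            iv(logassets cash2, eq(diff)) ///
            iv(tdum4-tdum7, eq(level)) ///
            robust small twostep artests(3)
        ```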



        • #5
          Dear Sebastian, thank you very much for your answer.
          I'll reconsider my choice of endogenous/exogenous variables based on economic theory and the objective of the model.
