  • xtabond2: Model Evaluation

    Dear Statalisters,

    I am working on the determinants of corporate cash holdings with a panel dataset of ~700 firms across 16 years. Having gone through the related literature, Roodman (2009), material on the internet, and posts on Statalist, I have developed the following model using xtabond2 to identify factors that influence cash holdings.

    Code:
    xtabond2 CashHoldings1 L.CashHoldings1 Size1 Leverage1 Liquidity1 Profitability4 GrowthPotential2 OperatingCashflow Dividend2 CapitalExpenditure1 CashFlowVol15years WPromoterSharesin1  i.Year, twostep small robust cluster(CompanyID) orthogonal artests(2) gmm(L.CashHoldings1, lag(2 2) eq(d)) gmm(Leverage1 Liquidity1 GrowthPotential2 Dividend2 CapitalExpenditure1, lag(2 5) eq(d) collapse) iv(Size1 Profitability4  WPromoterSharesin1 CashFlowVol15years OperatingCashflow i.Year, eq(l))
    Code:
    Dynamic panel-data estimation, two-step system GMM
    ------------------------------------------------------------------------------
    Group variable: CompanyID                       Number of obs      =      6991
    Time variable : Year                            Number of groups   =       671
    Number of instruments = 53                      Obs per group: min =         1
    F(27, 670)    =    173.51                                      avg =     10.42
    Prob > F      =     0.000                                      max =        15
                                             (Std. Err. adjusted for clustering on CompanyID)
    -----------------------------------------------------------------------------------------
                            |              Corrected
              CashHoldings1 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    ------------------------+----------------------------------------------------------------
              CashHoldings1 |
                        L1. |   .5456747   .0595597     9.16   0.000     .4287286    .6626207
                            |
                      Size1 |  -.0016775   .0008491    -1.98   0.049    -.0033447   -.0000103
                  Leverage1 |   .0217131   .0190564     1.14   0.255    -.0157043    .0591306
                 Liquidity1 |  -.0248853   .0120496    -2.07   0.039    -.0485448   -.0012257
             Profitability4 |   .1125777   .0334868     3.36   0.001      .046826    .1783294
    GrowthPotential2TobinsQ |  -.0092466   .0050835    -1.82   0.069    -.0192282    .0007349
          OperatingCashflow |    .073043   .0137996     5.29   0.000     .0459473    .1001386
                  Dividend2 |  -.0210518   .0327112    -0.64   0.520    -.0852806    .0431771
        CapitalExpenditure1 |  -.1112026   .0363169    -3.06   0.002    -.1825111    -.039894
         CashFlowVol15years |    .135721   .0454753     2.98   0.003     .0464297    .2250123
         WPromoterSharesin1 |  -.0124934   .0049304    -2.53   0.012    -.0221743   -.0028125
                            |
                       Year |
                      2001  |          0  (empty)
                      2002  |   .0419119   .0326967     1.28   0.200    -.0222884    .1061122
                      2003  |   .0433601   .0324212     1.34   0.182    -.0202994    .1070196
                      2004  |   .0472348   .0324283     1.46   0.146    -.0164385    .1109082
                      2005  |   .0527746   .0326891     1.61   0.107    -.0114108      .11696
                      2006  |   .0579921   .0328381     1.77   0.078    -.0064859    .1224701
                      2007  |   .0584495   .0329269     1.78   0.076    -.0062029    .1231018
                      2008  |   .0486299    .032612     1.49   0.136    -.0154041    .1126639
                      2009  |   .0518934    .032595     1.59   0.112    -.0121073     .115894
                      2010  |   .0532916   .0322942     1.65   0.099    -.0101183    .1167016
                      2011  |   .0455951    .030552     1.49   0.136     -.014394    .1055842
                      2012  |   .0459586    .031354     1.47   0.143    -.0156054    .1075225
                      2013  |   .0408918   .0296378     1.38   0.168    -.0173024    .0990859
                      2014  |   .0441315   .0301533     1.46   0.144    -.0150748    .1033378
                      2015  |   .0482326   .0313953     1.54   0.125    -.0134124    .1098777
                      2016  |   .0482783   .0322953     1.49   0.135    -.0151339    .1116905
                            |
                      _cons |          0  (omitted)
    -----------------------------------------------------------------------------------------
    Instruments for orthogonal deviations equation
      GMM-type (missing=0, separate instruments for each period unless collapsed)
        L(2/5).(Leverage1 Liquidity1 GrowthPotential2TobinsQ Dividend2
        CapitalExpenditure1) collapsed
        L2.L.CashHoldings1
    Instruments for levels equation
      Standard
        Size1 Profitability4 WPromoterSharesin1 CashFlowVol15years
        OperatingCashflow 2001b.Year 2002.Year 2003.Year 2004.Year 2005.Year
        2006.Year 2007.Year 2008.Year 2009.Year 2010.Year 2011.Year 2012.Year
        2013.Year 2014.Year 2015.Year 2016.Year
        _cons
    ------------------------------------------------------------------------------
    Arellano-Bond test for AR(1) in first differences: z =  -7.88  Pr > z =  0.000
    Arellano-Bond test for AR(2) in first differences: z =   1.40  Pr > z =  0.161
    ------------------------------------------------------------------------------
    Sargan test of overid. restrictions: chi2(25)   =  49.40  Prob > chi2 =  0.003
      (Not robust, but not weakened by many instruments.)
    Hansen test of overid. restrictions: chi2(25)   =  32.88  Prob > chi2 =  0.134
      (Robust, but weakened by many instruments.)
    
    Difference-in-Hansen tests of exogeneity of instrument subsets:
      gmm(L.CashHoldings1, eq(diff) lag(2 2))
        Hansen test excluding group:     chi2(12)   =  13.17  Prob > chi2 =  0.357
        Difference (null H = exogenous): chi2(13)   =  19.71  Prob > chi2 =  0.103
      gmm(Leverage1 Liquidity1 GrowthPotential2TobinsQ Dividend2 CapitalExpenditure1, collapse eq(diff) lag
    > (2 5))
        Hansen test excluding group:     chi2(5)    =   8.40  Prob > chi2 =  0.136
        Difference (null H = exogenous): chi2(20)   =  24.48  Prob > chi2 =  0.222
      iv(Size1 Profitability4 WPromoterSharesin1 CashFlowVol15years OperatingCashflow 2001b.Year 2002.Year
    > 2003.Year 2004.Year 2005.Year 2006.Year 2007.Year 2008.Year 2009.Year 2010.Year 2011.Year 2012.Year 2
    > 013.Year 2014.Year 2015.Year 2016.Year, eq(level))
        Hansen test excluding group:     chi2(6)    =   6.29  Prob > chi2 =  0.391
        Difference (null H = exogenous): chi2(19)   =  26.59  Prob > chi2 =  0.115
    Below are my questions:

    1. Is it fine to provide the lag range for lagged dependent variable as (2 2) and for endogenous variables (gmm style) as (2 5)? I ask this because I have arrived at these lag ranges after a lot of experimentation to ensure AR(2) and Hansen tests are within acceptable limits.

    2. Are the results of all Hansen/Sargan tests reported in this output within acceptable limits? I ask this because Roodman (2009) mentions that one should look out for p values of Hansen tests close to 0.25. Moreover, I would also like to know the range for which p values of Hansen tests are deemed ideal.

    3. Does the overall output of the model seem statistically valid? Specifically, is the instrument count within acceptable limits? Are the results for time dummies correct with reference to the bug in xtabond2 which omits/drops time dummies?

    Help regarding the abovementioned issues is highly appreciated. Thanks!

  • #2
    Eagerly waiting for a response!!

    • #3

      Any suggestions would be really appreciated! Many thanks.

      • #4
        Any inputs are welcome. Thanks!!

        • #5
          Waiting for a reply!! Thanks!

          • #6
            Any help is more than welcome!!

            • #7
              Sincerely awaiting guidance!

              • #8
                Any helpful comments are welcome!!

                • #9
                  Waiting for any potential guidance.

                  • #10
                    1. The chosen lag range creates valid instruments if there is no serial error correlation. The Arellano-Bond AR(2) test does not reject the latter assumption.

                    2. That is a difficult judgement. At the conventional significance levels, the (Difference-in-)Hansen tests do not reject the null hypothesis. In a strict sense, you are fine. However, some people might remain sceptical given the relatively small p-values. In such a situation, I believe that it is important that you convince the readers of your work that the assumptions you have made are reasonable. That is much more important than further experimentation with the aim of improving the p-values a little bit further. Specification tests (and p-values in general) should not be overrated. I personally do not like the idea of an "ideal range of p-values". If you clearly reject the null hypothesis, you certainly should worry. If the p-values are essentially 1, you should worry as well (as explained by Roodman).

                    3. Your instrument count is not necessarily too large. Again, it is not easy to define an ideal number of instruments. Note that some of your time dummies are omitted. As I have explained elsewhere on Statalist, there is a bug in xtabond2 in this situation that gives incorrect degrees of freedom (and therefore also incorrect p-values) for the Hansen tests. In your case, the correct degrees of freedom should be 27 instead of 25. This needs to be fixed first before you can even start looking at the Hansen test p-values. To avoid this problem, I recommend using my xtdpdgmm command.
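As a rough illustration of point 3, the corrected degrees of freedom can be reproduced by hand, assuming the Hansen test has as many degrees of freedom as there are instruments minus coefficients that are actually estimated. The counts below are read off the xtabond2 output in post #1; the scalar names are made up for this sketch. The reported value of 25 is consistent with xtabond2 also counting the empty 2001 dummy and the omitted constant, although that reading is an inference from the output, not something documented here.

```stata
* Sketch: Hansen-test df = number of instruments - number of estimated coefficients.
* All counts are taken from the xtabond2 output in post #1.
scalar n_instr = 53             // "Number of instruments = 53"
scalar n_coef  = 1 + 10 + 15    // L.CashHoldings1 + 10 regressors + year dummies 2002-2016
* The 2001 dummy is empty and _cons is omitted, so neither is counted.
display "corrected df = " (n_instr - n_coef)

* p-value of the Hansen statistic (chi2 = 32.88) under the reported and the corrected df
display "p-value, df = 25: " %6.4f chi2tail(25, 32.88)
display "p-value, df = 27: " %6.4f chi2tail(27, 32.88)
```

The first display line gives 27. The df = 25 line should reproduce the 0.134 printed by xtabond2, and the df = 27 line gives the p-value that would apply after the correction.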
                    https://twitter.com/Kripfganz

                    • #11
                      Dear Prof Sebastian:

                      Thanks a lot for your invaluable guidance. Heartfelt gratitude for the same. Further, I have the following queries:

                      1. As shown in my output in post #1, the Sargan test's null hypothesis is rejected. I would like to know whether and when we should worry about the result of this test. Or should we consider only the result of Hansen's test for verifying instrument validity?

                      2. Should we also consider the results of "Difference-in-Hansen tests of exogeneity of instrument subsets" and ensure that their p values also lie within acceptable limits? Or is the result of overall Hansen test (mentioned at the very beginning) enough to examine instrument validity?

                      3. Keeping in view your suggestion to use xtdpdgmm, I would like to use the same. However, I could not replicate my results shown in post #1. I used the following command and received the following error message.

                      Code:
                      xtdpdgmm CashHoldings1 L.CashHoldings1 Size1 Leverage1 Liquidity1 Profitability4 GrowthPotential2 OperatingCashflow Dividend2 CapitalExpenditure1 CashFlowVol15years WPromoterSharesin1, gmmiv(CashHoldings1, lag(2 2) diff model(fodev)) gmmiv(Leverage1 Liquidity1 GrowthPotential2 Dividend2 CapitalExpenditure1, lag(2 5) diff collapse model(fodev)) iv(Size1 Profitability4  WPromoterSharesin1 CashFlowVol15years OperatingCashflow, lag () diff model(fodev)) twostep vce(cluster CompanyID)
                      Code:
                      Generalized method of moments estimation
                      not sorted
                                 editmissing():  3598  Stata returned error
                            xtdpdgmm_opt::iv():     -  function returned error
                          xtdpdgmm_opt::init():     -  function returned error
                                    xtdpdgmm():     -  function returned error
                                       <istmt>:     -  function returned error
                      r(3598);
                      I humbly request you to help me replicate my results in post #1 using xtdpdgmm.

                      Thanks!

                      • #12
                        1. After the two-step estimator, the Sargan test is not really relevant. It only matters after the one-step estimator if you are assuming that the one-step weighting matrix is already optimal.

                        2. Ideally, the overall Hansen test and all Difference-in-Hansen tests should not reject the null hypothesis. If the overall Hansen test rejects the null hypothesis, then the Difference-in-Hansen tests could be useful to identify potentially troublesome instruments. If the overall Hansen test does not reject the null hypothesis, then a rejection of any of the Difference-in-Hansen tests would be a contradiction. Yet, in practice this can happen. It is then up to you to make a judgement whether you believe the overall Hansen test. If your overall Hansen test only marginally fails to reject, you might still worry and reconsider whether the instruments in question can be justified.

                        3. Many thanks for flagging this bug in my xtdpdgmm program. I have now fixed the problem. Please update the program and try again.
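For point 2, a minimal sketch of how the overall and the incremental (Difference-in-Hansen) tests might be requested after xtdpdgmm. It uses the abdata example data, not the model from post #1, and it assumes an xtdpdgmm version that supports the overid estimation option and estat overid, difference; please check help xtdpdgmm for the installed version before relying on it.

```stata
* Sketch on example data; the specification is illustrative only.
webuse abdata, clear

* Assumption: the overid option stores what the incremental tests need.
xtdpdgmm n L.n k, gmmiv(L.n, lag(0 0) model(fodev)) ///
    gmmiv(k, lag(1 4) collapse model(fodev)) twostep vce(robust) overid

estat overid                // overall Sargan-Hansen test
estat overid, difference    // incremental tests for subsets of instruments
```

If the overall test does not reject but one of the incremental tests does, that is the situation described above in which a judgement about the instruments in question is needed.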
                        https://twitter.com/Kripfganz

                        • #13
                          Thanks a ton Prof. Sebastian for your brilliant guidance once again. I updated the command and ran the following code.

                          Code:
                           
                           xtdpdgmm CashHoldings1 L.CashHoldings1 Size1 Leverage1 Liquidity1 Profitability4 GrowthPotential2 OperatingCashflow Dividend2 CapitalExpenditure1 CashFlowVol15years WPromoterSharesin1, gmmiv(CashHoldings1, lag(2 2) diff model(fodev)) gmmiv(Leverage1 Liquidity1 GrowthPotential2 Dividend2 CapitalExpenditure1, lag(2 5) diff collapse model(fodev)) iv(Size1 Profitability4  WPromoterSharesin1 CashFlowVol15years OperatingCashflow, lag () diff model(fodev)) twostep vce(cluster CompanyID)
                          As per my limited understanding, I tried to replicate the output (or rather command) mentioned in post #1. However, the output I received (mentioned below) is far from that provided in post #1.

                          Code:
                          Generalized method of moments estimation
                          
                          Step 1         f(b) =  .00009857
                          Step 2         f(b) =  .04489694
                          
                          Group variable: CompanyID                    Number of obs         =      6991
                          Time variable: Year                          Number of groups      =       671
                          
                          Moment conditions:     linear =      38      Obs per group:    min =         1
                                              nonlinear =       0                        avg =  10.41878
                                                  total =      38                        max =        15
                          
                                                                 (Std. Err. adjusted for 671 clusters in CompanyID)
                          -----------------------------------------------------------------------------------------
                                                  |              WC-Robust
                                    CashHoldings1 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
                          ------------------------+----------------------------------------------------------------
                                    CashHoldings1 |
                                              L1. |   .3752041   .0884388     4.24   0.000     .2018672     .548541
                                                  |
                                            Size1 |   -.003643   .0121972    -0.30   0.765    -.0275491    .0202632
                                        Leverage1 |  -.1000797   .0580081    -1.73   0.084    -.2137736    .0136141
                                       Liquidity1 |   .0517889   .0485204     1.07   0.286    -.0433094    .1468871
                                   Profitability4 |   .0919613   .0603426     1.52   0.128    -.0263081    .2102306
                          GrowthPotential2TobinsQ |  -.0266977   .0130475    -2.05   0.041    -.0522703    -.001125
                                OperatingCashflow |    .054695   .0168896     3.24   0.001     .0215921    .0877979
                                        Dividend2 |   .1130135   .0576774     1.96   0.050    -.0000322    .2260592
                              CapitalExpenditure1 |   .0091074    .085736     0.11   0.915    -.1589321    .1771468
                               CashFlowVol15years |  -.0316621   .1647978    -0.19   0.848    -.3546598    .2913357
                               WPromoterSharesin1 |   .0957181   .1081842     0.88   0.376    -.1163191    .3077553
                                            _cons |  -.0780799   .1257178    -0.62   0.535    -.3244823    .1683224
                          -----------------------------------------------------------------------------------------
                          
                          . help xtdpdgmm
                          
                          . estat serial
                          
                          Arellano-Bond test for autocorrelation of the first-differenced residuals
                          H0: no autocorrelation of order 1:     z =   -6.8619   Prob > |z|  =    0.0000
                          H0: no autocorrelation of order 2:     z =   -0.3940   Prob > |z|  =    0.6936
                          
                          . estat overid
                          
                          Sargan-Hansen test of the overidentifying restrictions
                          H0: overidentifying restrictions are valid
                          
                          2-step moment functions, 2-step weighting matrix       chi2(26)    =   30.1258
                                                                                 Prob > chi2 =    0.2624
                          
                          2-step moment functions, 3-step weighting matrix       chi2(26)    =   24.8519
                                                                                 Prob > chi2 =    0.5274
                          Since I am new to xtdpdgmm, I request you to guide me in order to obtain the same output as mentioned in post #1 so that I can use and report results according to xtdpdgmm.

                          Thanks and Regards
                          Prateek

                          • #14
                            I recommend first simplifying the model by excluding some regressors. (You can add them again later once you have figured out how to specify the respective options.) This makes it easier to compare the commands.

                            In your xtdpdgmm specification, you currently do not have time effects. You need to add them by specifying the option teffects.

                            xtdpdgmm does not do a degrees-of-freedom correction for the standard errors. For comparison purposes, you should remove the small option from the xtabond2 command.

                            I have just noticed that you want to use forward-orthogonal deviations. These are implemented in xtabond2 in a quite problematic way; see here for some details. In fact, I do not think that it is possible to obtain the correct specification of the system GMM estimator with forward-orthogonal deviations in xtabond2. (You can call it a bug, if you want.) Without the time dummies and the level instruments, the following two specifications are equivalent:
                            Code:
                            webuse abdata
                            xtabond2 n L.n k, twostep robust orthogonal gmm(L.n, lag(1 1) eq(d)) gmm(k, lag(2 5) eq(d) collapse)
                            xtdpdgmm n L.n k, gmmiv(L.n, lag(0 0) model(fodev)) gmmiv(k, lag(1 4) collapse model(fodev)) twostep vce(r)
                            where k is treated as an endogenous regressor. Notice that the validity of lags as instruments differs when using forward-orthogonal deviations compared to first differences. The first lag of an endogenous variable is already a valid instrument when using forward-orthogonal deviations while it is only the second lag when using first differences.

                            The fact that the instruments need to be shifted by one period in the xtabond2 specification to achieve the same result is the cause of all the trouble. If you now want to add instruments for a level equation, say by including time dummies:
                            Code:
                            xtabond2 n L.n k yr1978-yr1984, twostep robust orthogonal gmm(L.n, lag(1 1) eq(d)) gmm(k, lag(2 5) eq(d) collapse) iv(yr1978-yr1984, eq(l))
                            xtdpdgmm n L.n k yr1978-yr1984, gmmiv(L.n, lag(0 0) model(fodev)) gmmiv(k, lag(1 4) collapse model(fodev)) iv(yr1978-yr1984, model(level)) twostep vce(r)
                            you will notice that the results are suddenly different. As far as I can see, there is no way to fix this problem with xtabond2, and only xtdpdgmm delivers the "correct" results.
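One convenient way to see the discrepancy is to store both sets of estimates and tabulate them side by side. This is only a sketch: the names xtab2 and dpdgmm are arbitrary labels chosen here, and both commands are community-contributed and need to be installed first.

```stata
* Sketch: compare the two specifications with level instruments side by side.
webuse abdata, clear

quietly xtabond2 n L.n k yr1978-yr1984, twostep robust orthogonal ///
    gmm(L.n, lag(1 1) eq(d)) gmm(k, lag(2 5) eq(d) collapse) iv(yr1978-yr1984, eq(l))
estimates store xtab2

quietly xtdpdgmm n L.n k yr1978-yr1984, gmmiv(L.n, lag(0 0) model(fodev)) ///
    gmmiv(k, lag(1 4) collapse model(fodev)) iv(yr1978-yr1984, model(level)) twostep vce(r)
estimates store dpdgmm

* Coefficients and standard errors from both commands in one table
estimates table xtab2 dpdgmm, b(%9.4f) se(%9.4f)
```

Dropping yr1978-yr1984 and the iv() options from both commands and rerunning the table should show the two columns agreeing, in line with the equivalence stated above for the specifications without level instruments.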
                            https://twitter.com/Kripfganz

                            • #15
                              Dear Prof. Sebastian:

                              With all your help and support, I have been able to generate results for my model using xtdpdgmm. Below is the final command and corresponding output.

                              Code:
                              xtdpdgmm CashHoldings1 L.CashHoldings1 Size1 Leverage1 Liquidity1 Profitability4 GrowthPotential2 OperatingCashflow Dividend2 CapitalExpenditure1 CashFlowVol15years WPromoterSharesin1, teffects twostep vce(cluster CompanyID) gmmiv(L.CashHoldings1, lag(1 1) model(fodev)) gmmiv(Leverage1 Liquidity1 GrowthPotential2 Dividend2 CapitalExpenditure1, lag(1 4) collapse model(fodev)) iv(Year* Size1 Profitability4  WPromoterSharesin1 CashFlowVol15years OperatingCashflow, model(level))
                              Code:
                              Generalized method of moments estimation
                              
                              Step 1         f(b) =  .00010213
                              Step 2         f(b) =  .04338797
                              
                              Group variable: CompanyID                    Number of obs         =      6991
                              Time variable: Year                          Number of groups      =       671
                              
                              Moment conditions:     linear =      53      Obs per group:    min =         1
                                                  nonlinear =       0                        avg =  10.41878
                                                      total =      53                        max =        15
                              
                                                                     (Std. Err. adjusted for 671 clusters in CompanyID)
                              -----------------------------------------------------------------------------------------
                                                      |              WC-Robust
                                        CashHoldings1 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
                              ------------------------+----------------------------------------------------------------
                                        CashHoldings1 |
                                                  L1. |   .5405655   .0639851     8.45   0.000      .415157     .665974
                                                      |
                                                Size1 |  -.0015902   .0008944    -1.78   0.075    -.0033433    .0001628
                                            Leverage1 |   .0215192   .0201021     1.07   0.284    -.0178801    .0609186
                                           Liquidity1 |  -.0271283   .0128011    -2.12   0.034    -.0522179   -.0020386
                                       Profitability4 |   .1141001   .0356895     3.20   0.001       .04415    .1840502
                              GrowthPotential2TobinsQ |   -.008502   .0056203    -1.51   0.130    -.0195176    .0025135
                                    OperatingCashflow |   .0686659   .0150234     4.57   0.000     .0392205    .0981113
                                            Dividend2 |  -.0413027   .0341163    -1.21   0.226    -.1081693    .0255639
                                  CapitalExpenditure1 |  -.1140552   .0383933    -2.97   0.003    -.1893047   -.0388057
                                   CashFlowVol15years |   .1400365   .0454519     3.08   0.002     .0509524    .2291206
                                   WPromoterSharesin1 |  -.0117863   .0049264    -2.39   0.017    -.0214418   -.0021307
                                                      |
                                                 Year |
                                                2003  |   .0011026   .0024876     0.44   0.658     -.003773    .0059782
                                                2004  |   .0058371   .0025675     2.27   0.023     .0008049    .0108692
                                                2005  |   .0107452   .0036095     2.98   0.003     .0036707    .0178196
                                                2006  |   .0156361   .0041099     3.80   0.000     .0075809    .0236914
                                                2007  |   .0160535   .0043826     3.66   0.000     .0074637    .0246432
                                                2008  |    .007216   .0036688     1.97   0.049     .0000252    .0144068
                                                2009  |   .0093597   .0033894     2.76   0.006     .0027166    .0160028
                                                2010  |   .0112303   .0037242     3.02   0.003     .0039311    .0185295
                                                2011  |    .001814     .00448     0.40   0.686    -.0069666    .0105945
                                                2012  |   .0030444   .0041776     0.73   0.466    -.0051434    .0112323
                                                2013  |  -.0029294    .004528    -0.65   0.518    -.0118041    .0059453
                                                2014  |   .0000594   .0051549     0.01   0.991    -.0100439    .0101628
                                                2015  |   .0050484   .0057278     0.88   0.378    -.0061779    .0162746
                                                2016  |   .0054216   .0066024     0.82   0.412    -.0075188    .0183621
                                                      |
                                                _cons |   .0616041    .033027     1.87   0.062    -.0031275    .1263358
                              -----------------------------------------------------------------------------------------
                              
                              . estat serial
                              
                              Arellano-Bond test for autocorrelation of the first-differenced residuals
                              H0: no autocorrelation of order 1:     z =   -7.6521   Prob > |z|  =    0.0000
                              H0: no autocorrelation of order 2:     z =    1.4230   Prob > |z|  =    0.1547
                              
                              . estat overid
                              
                              Sargan-Hansen test of the overidentifying restrictions
                              H0: overidentifying restrictions are valid
                              
                              2-step moment functions, 2-step weighting matrix       chi2(27)    =   29.1133
                                                                                     Prob > chi2 =    0.3554
                              
                              2-step moment functions, 3-step weighting matrix       chi2(27)    =   29.2938
                                                                                     Prob > chi2 =    0.3468
                              As you said, the above-mentioned results are not identical to those obtained from xtabond2 but are very similar indeed. Below are my follow-up queries.

                              1. Does the overall command and output of the model mentioned above seem statistically and technically valid? Since I am new to xtdpdgmm, I just want to confirm that I am not missing any crucial sub-option and that there is nothing inconsistent/fallacious in the specification of my model.

                              2. Regarding post-estimation diagnostics, I suppose the Arellano-Bond test for autocorrelation and the Sargan-Hansen test support the specification of the model. Is there anything else that we need to care about in post-estimation diagnostics?

                              3. Regarding the "Sargan-Hansen test of the overidentifying restrictions", what is the difference between the 2-step and the 3-step weighting matrix? Which one should be considered and reported in a paper?

                              4. How did you come to know that the degrees of freedom reported in the output shown in post #1 should be 27 instead of 25? How do I check the degrees of freedom in the xtdpdgmm output?

                              5. Just a general query: If xtabond2 has some bugs (as you explained in your posts), I wonder why the command is still the primary choice for system GMM and why it is so widely used and reported in papers? Also, is there any specific reason as to why it has not been updated to correct these bugs?
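On question 4, a quick consistency check that can be done by hand, assuming the test's degrees of freedom equal the number of moment conditions minus the number of estimated coefficients. The counts are read off the xtdpdgmm outputs in posts #13 and #15; nothing here comes from the command's stored results.

```stata
* Post #13: 38 moment conditions; 12 coefficients (lag + 10 regressors + _cons)
display "df, post #13 = " (38 - 12)

* Post #15: 53 moment conditions; 26 coefficients
* (lag + 10 regressors + 14 year dummies for 2003-2016 + _cons)
display "df, post #15 = " (53 - 26)

* p-value implied by chi2(27) = 29.1133, to compare with the estat overid output
display "p-value = " %6.4f chi2tail(27, 29.1133)
```

The two differences are 26 and 27, matching the chi2(26) and chi2(27) shown by estat overid in the respective posts, and the last line should agree with the reported 0.3554.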

                              Thanks and Regards
                              Prateek
                              Last edited by Prateek Bedi; 27 Apr 2019, 06:02.
