
  • Am I correct with the two-step GMM syntax?

    Hello dear Stata users,

    I am doing research on working capital management and firm performance with two-step GMM, using the xtabond2 command in Stata. The syntax and results are posted below.

    Question 1. May I change the lags of the instrumental variables when I run the regression with "inventory days" and "accounts payable" as independent variables, or with a different dependent variable (e.g. TobinsQ), as long as I get appropriate results from the Sargan/Hansen tests?

    Question 2. Can you please explain what eq(diff) and eq(level) do? Sorry, but I could not find clear answers on this.

    Question 3. What is the lowest acceptable chi-squared value for the Hansen/Sargan test?

    ARD is Accounts Receivables Days and ARDsq is the square of ARD

    xtabond2 ROA l.ROA ARD ARDsq TANG CR SALES LEV GR y* industry*, gmm(l.ROA, l(2 2)) iv(ARD ARDsq TANG CR SALES LEV GR, eq(diff)) iv(ARD ARDsq TANG CR SALES LEV GR y*, eq(level)) nodiffsargan twostep robust

    Favoring space over speed. To switch, type or click on mata: mata set matafavor speed, perm.
    year1 dropped due to collinearity
    year6 dropped due to collinearity
    industry6 dropped due to collinearity
    industry8 dropped due to collinearity

    Warning: Two-step estimated covariance matrix of moments is singular.
    Using a generalized inverse to calculate optimal weighting matrix for two-step estimation.

    Dynamic panel-data estimation, two-step system GMM
    ------------------------------------------------------------------------------
    Group variable: id                              Number of obs      =      1638
    Time variable : Year                            Number of groups   =       227
    Number of instruments = 40                      Obs per group: min =         1
    Wald chi2(26) =   5520.06                                      avg =      7.22
    Prob > chi2   =     0.000                                      max =        10
    ------------------------------------------------------------------------------
    ROA | Coef. Corrected Std. Err. z P>|z| [95% Conf. Interval] Sig
    -------------+----------------------------------------------------------------
    L.ROA 0.720 0.145 4.95 0.000 0.435 1.004 ***
    ARD -0.003 0.001 -3.49 0.000 -0.005 -0.001 ***
    ARDsq 0.000 0.000 2.29 0.022 0.000 0.000 **
    TANG -0.184 0.105 -1.76 0.079 -0.389 0.021 *
    CR -0.014 0.013 -1.06 0.289 -0.039 0.012
    SALES -0.022 0.061 -0.36 0.717 -0.143 0.098
    LEV -0.208 0.106 -1.97 0.049 -0.415 -0.001 **
    GR 0.218 0.039 5.54 0.000 0.141 0.296 ***
    year2 0.037 0.035 1.05 0.293 -0.032 0.106
    year3 0.045 0.033 1.36 0.173 -0.020 0.109
    year4 0.004 0.018 0.21 0.831 -0.031 0.039
    year5 0.055 0.018 3.04 0.002 0.019 0.090 ***
    year7 0.031 0.021 1.52 0.130 -0.009 0.072
    year8 0.028 0.023 1.21 0.225 -0.017 0.073
    year9 0.067 0.024 2.76 0.006 0.019 0.114 ***
    year10 -0.023 0.030 -0.76 0.448 -0.081 0.036
    year11 -0.035 0.031 -1.16 0.248 -0.095 0.025
    industry1 -0.096 2.379 -0.04 0.968 -4.760 4.568
    industry2 -1.574 1.544 -1.02 0.308 -4.600 1.452
    industry3 -0.144 0.814 -0.18 0.859 -1.740 1.452
    industry4 -0.538 0.843 -0.64 0.524 -2.190 1.115
    industry5 -0.689 0.774 -0.89 0.373 -2.205 0.827
    industry7 -1.025 0.675 -1.52 0.129 -2.348 0.297
    industry9 0.225 0.234 0.96 0.336 -0.234 0.683
    industry10 0.437 0.664 0.66 0.510 -0.864 1.738
    industry11 -0.112 0.338 -0.33 0.739 -0.774 0.549
    Constant 0.750 0.516 1.45 0.147 -0.262 1.762
    Mean dependent var 0.954 SD dependent var 0.617
    Number of obs 1638.000 Chi-square 5520.064
    ------------------------------------------------------------------------------
    Instruments for first differences equation
    Standard
    D.(ARD ARDsq TANG CR SALES LEV GR)
    GMM-type (missing=0, separate instruments for each period unless collapsed)
    L2.L.ROA
    Instruments for levels equation
    Standard
    ARD ARDsq TANG CR SALES LEV GR year1 year2 year3 year4 year5 year6 year7
    year8 year9 year10 year11
    _cons
    GMM-type (missing=0, separate instruments for each period unless collapsed)
    DL.L.ROA
    ------------------------------------------------------------------------------
    Arellano-Bond test for AR(1) in first differences: z = -3.96 Pr > z = 0.000
    Arellano-Bond test for AR(2) in first differences: z = 0.43 Pr > z = 0.668
    ------------------------------------------------------------------------------
    Sargan test of overid. restrictions: chi2(13) = 31.50 Prob > chi2 = 0.003
    (Not robust, but not weakened by many instruments.)
    Hansen test of overid. restrictions: chi2(13) = 11.11 Prob > chi2 = 0.602
    (Robust, but weakened by many instruments.)
    Last edited by Avaz Yusibov; 02 Nov 2021, 02:56.

  • #2
    1. I recommend following a structured approach rather than arbitrarily varying lags in the quest for the "best results". It is usually a good idea not to restrict the instruments for the first-differenced model to a single lag but instead to use at least a few more lags; say, lag(2 5) for example.
    2. eq(diff) specifies instruments for the first-differenced model, eq(level) specifies instruments for the untransformed model in levels.
    3. Not sure what you mean by this. You would want to not reject the null hypothesis of the Sargan-Hansen test, thus you would like to have a p-value larger than your chosen significance level. To be on the safe side, you may want to set a higher significance level than you would normally do.
    As an additional observation: You implicitly assumed that all variables in iv(ARD ARDsq TANG CR SALES LEV GR y*, eq(level)) are uncorrelated with the unobserved "fixed effects". This may or may not be a reasonable assumption. Often, you would want to use first differences of (some of) those variables as instruments for the level model.
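
    As a rough illustration of that last observation, the level-equation instruments could be entered as first differences. This is only a sketch, not a recommended final specification. It assumes that xtabond2 passes iv() variables to the level equation untransformed, so the D. operator is written out explicitly. The lag range (2 5) is just the example range from point 1, and the treatment of the regressors as exogenous in iv() is carried over unchanged from the original command.

    ```stata
    * Sketch only: same model as in #1, but
    *  - L.ROA is instrumented with lags 2 to 5 rather than a single lag, and
    *  - the level equation uses first differences D.(...) of the regressors
    *    (assumption: iv(..., eq(level)) does not difference them automatically).
    xtabond2 ROA l.ROA ARD ARDsq TANG CR SALES LEV GR y* industry*, gmm(l.ROA, laglimits(2 5)) iv(ARD ARDsq TANG CR SALES LEV GR, eq(diff)) iv(D.(ARD ARDsq TANG CR SALES LEV GR), eq(level)) iv(y*, eq(level)) nodiffsargan twostep robust
    ```

    The instrument list printed beneath the estimates should then show D.(ARD ... GR) under "Instruments for levels equation", which is a quick way to check that the specification did what was intended.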

    The following presentation slides might be helpful:
    https://www.kripfganz.de/stata/



    • #3
      Thank you very much for replying. How would you change the command? Could you please rewrite the model specification? I do not fully understand the level and difference equations here.

      This is my syntax. Could you please rewrite it as you would specify it? An example is easier for me to understand.

      xtabond2 ROA l.ROA ARD ARDsq TANG CR SALES LEV GR y* industry*, gmm(l.ROA, l(2 2)) iv(ARD ARDsq TANG CR SALES LEV GR, eq(diff)) iv(ARD ARDsq TANG CR SALES LEV GR y*, eq(level)) nodiffsargan twostep robust



      • #4
        Before you proceed, I recommend that you get a better understanding of the estimators by reading some background literature about the difference and system GMM estimator for dynamic panel data models. My presentation slides above contain several references, for example Roodman's 2009 Stata Journal article.

        In a first step, you then need to decide whether you want to classify your variables as endogenous, predetermined, or strictly exogenous with respect to the idiosyncratic error component. In addition, besides year and industry dummies, it is usually advisable to treat all variables as potentially correlated with the time-invariant unit-specific error component. Please see again my presentation slides and the references therein. Assuming that all variables (besides year and industry dummies) are endogenous, you could set up the following system GMM implementation:
        Code:
        xtdpdgmm ROA L.ROA ARD ARDsq TANG CR SALES LEV GR y* industry*, gmm(L.ROA, lag(1 4) model(diff)) gmm(ARD ARDsq TANG CR SALES LEV GR, lag(2 5) model(diff)) gmm(L.ROA, lag(0 0) diff model(level)) gmm(ARD ARDsq TANG CR SALES LEV GR, lag(1 1) diff model(level)) iv(y* industry*, model(level)) collapse twostep vce(robust) small
        For details about the syntax and options, please consult again my presentation slides and the command help file.
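
        Once the command above has run, the usual specification checks can be requested as postestimation commands. The following is a minimal sketch, assuming the xtdpdgmm estimates are still in memory; the choice of testing serial correlation up to order 2 is an assumption for illustration, not something prescribed in this thread.

        ```stata
        * Run directly after the xtdpdgmm command above.
        * Sargan-Hansen test of the overidentifying restrictions
        estat overid
        * Arellano-Bond test for serial correlation in the first-differenced
        * errors, orders 1 and 2 (the order range is an illustrative choice)
        estat serial, ar(1/2)
        ```

        Broadly, one hopes not to reject the null in the overidentification test and not to find second-order serial correlation, in line with the discussion of the Sargan-Hansen test in #2.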
        https://www.kripfganz.de/stata/



        • #5
          Hello Dr Sebastian.

          Thank you very much again for taking the initiative to help me. I have a question about the chi-squared statistic of the Hansen-Sargan tests: what should its minimum and maximum be? I have read your presentation slides and Roodman (2009). The issue is that I have watched several videos on YouTube and read many comments on Statalist, and no two of them use the same method. That confuses me a lot. I know xtdpdgmm is used for non-linear models, but some papers on WCM and firm performance have used xtabond2. For example, one paper describes it this way: "All specifications of Eq. (10) are estimated with the GMM estimator system, using the Stata command xtabond2 [47]. In particular, we consider the right-side variables as endogenous variables and use their lags from t-2 to t-3 as instruments for the equations in differences, and the lagged first-differenced endogenous regressors as instruments for the level equations. In contrast, time dummies are considered to be exogenous variables." If possible, can you please write the syntax that matches this description?

          Your above code gave me this result.


          Code:
          Generalized method of moments estimation
          
          Fitting full model:
          Step 1         f(b) =  .01483229
          Step 2         f(b) =  .16030475
          
          Group variable: id                           Number of obs         =      1638
          Time variable: Year                          Number of groups      =       227
          
          Moment conditions:     linear =      59      Obs per group:    min =         1
                              nonlinear =       0                        avg =  7.215859
                                  total =      59                        max =        10
          
                                             (Std. Err. adjusted for 227 clusters in id)
          ------------------------------------------------------------------------------
                       |              WC-Robust
                   ROA |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
          -------------+----------------------------------------------------------------
                   ROA |
                   L1. |    .428002   .1914573     2.24   0.026     .0507324    .8052717
                       |
                   ARD |  -.0005593    .001533    -0.36   0.716      -.00358    .0024615
                 ARDsq |  -1.34e-06   5.29e-06    -0.25   0.801    -.0000118    9.08e-06
                  TANG |  -.2197456   .1871677    -1.17   0.242    -.5885625    .1490713
                    CR |  -.0246327   .0291492    -0.85   0.399    -.0820716    .0328062
                 SALES |   .1491611     .05595     2.67   0.008     .0389108    .2594114
                   LEV |  -.4005525   .2633979    -1.52   0.130    -.9195824    .1184773
                    GR |    .142973   .0890178     1.61   0.110     -.032438     .318384
                 year1 |          0  (omitted)
                 year2 |   .1039732   .0365318     2.85   0.005     .0319867    .1759596
                 year3 |   .0881445   .0312371     2.82   0.005     .0265913    .1496977
                 year4 |   .0605548   .0330754     1.83   0.068    -.0046208    .1257305
                 year5 |   .0782326   .0303052     2.58   0.010     .0185156    .1379496
                 year6 |   .0349172   .0306876     1.14   0.256    -.0255532    .0953876
                 year7 |   .0291707   .0299506     0.97   0.331    -.0298475    .0881889
                 year8 |   .0286389   .0288672     0.99   0.322    -.0282444    .0855223
                 year9 |   .0630019   .0283416     2.22   0.027     .0071543    .1188494
                year10 |          0  (omitted)
                year11 |  -.0736073   .0243996    -3.02   0.003     -.121687   -.0255276
             industry1 |  -.1519642   .1773551    -0.86   0.392    -.5014452    .1975169
             industry2 |  -.2565257   .2120583    -1.21   0.228      -.67439    .1613386
             industry3 |  -.3745091   .2134513    -1.75   0.081    -.7951183    .0461002
             industry4 |  -.1496469    .184586    -0.81   0.418    -.5133765    .2140827
             industry5 |  -.3514499   .2016478    -1.74   0.083    -.7488002    .0459003
             industry6 |  -.1367069    .140865    -0.97   0.333    -.4142836    .1408698
             industry7 |  -.4943237   .3267777    -1.51   0.132    -1.138244     .149597
             industry8 |          0  (omitted)
             industry9 |          0  (omitted)
            industry10 |  -.3355815   .1898879    -1.77   0.079    -.7097587    .0385956
            industry11 |   .1271968   .1690844     0.75   0.453    -.2059868    .4603804
                 _cons |  -.4184963   .4146979    -1.01   0.314    -1.235665    .3986726
          ------------------------------------------------------------------------------
          Instruments corresponding to the linear moment conditions:
           1, model(diff):
             L1.L.ROA L2.L.ROA L3.L.ROA L4.L.ROA
           2, model(diff):
             L2.ARD L3.ARD L4.ARD L5.ARD L2.ARDsq L3.ARDsq L4.ARDsq L5.ARDsq L2.TANG
             L3.TANG L4.TANG L5.TANG L2.CR L3.CR L4.CR L5.CR L2.SALES L3.SALES L4.SALES
             L5.SALES L2.LEV L3.LEV L4.LEV L5.LEV L2.GR L3.GR L4.GR L5.GR
           3, model(level):
             D.L.ROA
           4, model(level):
             L1.D.ARD L1.D.ARDsq L1.D.TANG L1.D.CR L1.D.SALES L1.D.LEV L1.D.GR
           5, model(level):
             year2 year3 year4 year5 year6 year7 year8 year9 year10 industry2 industry3
             industry4 industry5 industry6 industry7 industry9 industry10 industry11
           6, model(level):
             _cons
          
          . 
          
          . estat overid
          
          Sargan-Hansen test of the overidentifying restrictions
          H0: overidentifying restrictions are valid
          
          2-step moment functions, 2-step weighting matrix       chi2(32)    =   36.3892
                                                                 Prob > chi2 =    0.2716
          
          2-step moment functions, 3-step weighting matrix       chi2(32)    =   47.8251
                                                                 Prob > chi2 =    0.0357

          Last edited by Avaz Yusibov; 04 Nov 2021, 23:30.



          • #6
            I believe all you need to do is change lag(2 5) into lag(2 3) in order to use "their lags from t-2 to t-3 as instruments". xtdpdgmm can do almost everything you can do with xtabond2 and a few more things.
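
            Spelled out, that change applied to the command from #4 might look like the sketch below. Only the lag range of the second gmm() option differs; everything else is copied from #4. Whether the lag range for L.ROA should be shortened as well is left open here (the later command in #11 uses l(1 2) for it). The /// line continuations are only for readability and assume the command is run from a do-file.

            ```stata
            * Command from #4 with lag(2 5) replaced by lag(2 3) for the
            * endogenous regressors in the first-differenced model.
            xtdpdgmm ROA L.ROA ARD ARDsq TANG CR SALES LEV GR y* industry*, ///
                gmm(L.ROA, lag(1 4) model(diff)) ///
                gmm(ARD ARDsq TANG CR SALES LEV GR, lag(2 3) model(diff)) ///
                gmm(L.ROA, lag(0 0) diff model(level)) ///
                gmm(ARD ARDsq TANG CR SALES LEV GR, lag(1 1) diff model(level)) ///
                iv(y* industry*, model(level)) ///
                collapse twostep vce(robust) small
            ```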

            Regarding the p-value of the Hansen test, there are no established thresholds. The recent paper by Kiviet (2020, Econometrics and Statistics) might provide some insights.
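
            Since the question was phrased in terms of the chi-squared value rather than the p-value, it may help to see how the two are linked. The sketch below uses Stata's built-in chi2tail() function with the statistics and degrees of freedom reported earlier in this thread. The degrees of freedom equal the number of instruments (moment conditions) minus the number of estimated coefficients, so the statistic has no meaningful "minimum" or "maximum" on its own; it is only interpretable through its p-value.

            ```stata
            * Degrees of freedom = instruments - estimated coefficients
            display 40 - 27                  // xtabond2 output in #1: chi2(13)
            display 59 - 27                  // xtdpdgmm output in #5: chi2(32)

            * Upper-tail p-values implied by the reported statistics
            display chi2tail(13, 11.11)      // Hansen test in #1, reported as 0.602
            display chi2tail(13, 31.50)      // Sargan test in #1, reported as 0.003
            display chi2tail(32, 36.3892)    // estat overid in #5, reported as 0.2716
            display chi2tail(32, 47.8251)    // estat overid in #5, reported as 0.0357
            ```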


            • #7
              Thank you very much Dr. Sebastian for your great help. I appreciate your warm help!



              • #8
                I have another question related to GMM. How can I set up the command to control for unobservable heterogeneity? Can I go ahead with two-step system GMM if the independent variables are heteroskedastic? Is the robust option a remedy for heterogeneity? How can I add cluster(id) to the command?
                Last edited by Avaz Yusibov; 20 Nov 2021, 00:33.



                • #9
                  Heteroskedasticity and heterogeneity are very different concepts. I am not sure what you have in mind. Also, heteroskedastic independent variables are not a reason for concern. We are usually concerned about heteroskedasticity in the error term (which may be functionally related to the independent variables). Two-step robust standard errors account for heteroskedastic errors. Given that id is your panel identifier, vce(robust) is identical to vce(cluster id).
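
                  To make the distinction concrete, here is the model in generic error-components notation. The symbols are assumptions introduced for illustration and do not appear elsewhere in this thread: $y_{it}$ stands for ROA, $x_{it}$ collects the regressors, $\alpha_i$ is the time-invariant unit-specific effect, and $\varepsilon_{it}$ is the idiosyncratic error.

                  ```latex
                  % Dynamic panel model with an error-components structure
                  y_{it} = \rho\, y_{i,t-1} + x_{it}'\beta + \alpha_i + \varepsilon_{it}

                  % Unobserved heterogeneity: \alpha_i differs across units and may be
                  % correlated with the regressors. First-differencing removes it:
                  \Delta y_{it} = \rho\, \Delta y_{i,t-1} + \Delta x_{it}'\beta + \Delta\varepsilon_{it}

                  % Heteroskedasticity: the variance of the idiosyncratic error is not constant
                  \operatorname{Var}(\varepsilon_{it} \mid x_{it}) = \sigma_{it}^{2} \neq \sigma^{2}
                  ```

                  In this notation, unobserved heterogeneity is dealt with through the model transformation and the choice of instruments, whereas heteroskedasticity in $\varepsilon_{it}$ is what the robust (clustered) standard errors account for.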


                  • #10
                    I mean unobservable heterogeneity above

                    xtabond2 ROA l.ROA ARD ARDsq TANG CR SALES LEV GR y* industry*, gmm(l.ROA ARD ARDsq TANG CR SALES LEV GR, eq(diff) collapse l(2 3)) iv(l(2 2).(l.ROA ARD ARDsq TANG CR SALES LEV GR) y* industry*, eq(level)) nodiffsargan twostep robust small orthogonal

                    xtabond2 ROA l.ROA ARD ARDsq TANG CR SALES LEV GR y* industry*, gmm(l.ROA ARD ARDsq TANG CR SALES LEV GR, eq(diff) collapse l(2 5)) iv(l(2 2).(l.ROA ARD ARDsq TANG CR SALES LEV GR) y* industry*, eq(level)) nodiffsargan twostep robust small

                    Which of these commands is correct? I have unbalanced data, and I understand that the orthogonal option is suited to unbalanced data. I have treated all variables as endogenous, in line with a previous study that describes its approach as follows: "All specifications of Eq. (10) are estimated with the GMM estimator system [3], using the Stata command xtabond2 [47]. In particular, we consider the right-side variables as endogenous variables and use their lags from t-2 to t-3 as instruments for the equations in differences, and the lagged first-differenced endogenous regressors as instruments for the level equations. In contrast, time dummies are considered to be exogenous"

                    I posted this once above and tried to do the same, but I could not get a sensible result. Can you please help me with that? Are the above commands acceptable if the other results (J statistic, number of groups greater than number of instruments, and others) are fine? Also, can you please add the cluster id option? I could not get it to work in Stata.

                    Thanks in advance
                    Last edited by Avaz Yusibov; 26 Nov 2021, 06:53.



                    • #11
                      Neither of those commands will be right. Aside from the time dummies, the instruments specified with option iv(l(2 2).(l.ROA ARD ARDsq TANG CR SALES LEV GR) y* industry*, eq(level)) are invalid if there is unobserved heterogeneity. The iv() option does not automatically create first differences of the instruments for the level model. I recommend using the gmm() option with suboption collapse instead.

                      In accordance with the quoted statement, you may try the following:
                      Code:
                      xtdpdgmm ROA l.ROA ARD ARDsq TANG CR SALES LEV GR y* industry*, gmm(l.ROA, m(diff) collapse l(1 2)) gmm(ARD ARDsq TANG CR SALES LEV GR, m(diff) collapse l(2 3)) gmm(l.ROA, m(level) collapse l(0 0)) gmm(ARD ARDsq TANG CR SALES LEV GR, m(level) collapse l(1 1)) iv(y* industry*, m(level)) twostep vce(cluster id) small
                      If you want to use orthogonal deviations, the command differs slightly:
                      Code:
                      xtdpdgmm ROA l.ROA ARD ARDsq TANG CR SALES LEV GR y* industry*, gmm(l.ROA, m(fod) collapse l(0 1)) gmm(ARD ARDsq TANG CR SALES LEV GR, m(fod) collapse l(1 2)) gmm(l.ROA, m(level) collapse l(0 0)) gmm(ARD ARDsq TANG CR SALES LEV GR, m(level) collapse l(1 1)) iv(y* industry*, m(level)) twostep vce(cluster id) small
                      (Note that the lag specifications are different when using orthogonal deviations with xtdpdgmm compared to xtabond2.)


                      • #12
                        I tried, but it did not give a good result. If possible, can you please write the syntax for xtabond2? Some papers have used it rather than xtdpdgmm, but they do not show the command they used.



                        • #13
                          If you choose the same instruments in xtabond2, you will receive the same results as with xtdpdgmm. Just switching between these commands does not give you "better" results. What do you mean by your statement that the results are "not good"?
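
                          For illustration only, an xtabond2 specification intended to mirror the first xtdpdgmm command in #11 could look like the sketch below. It is not taken from this thread and has not been verified against the data. It rests on the assumption that xtabond2's gmm() option with equation(level) instruments the level equation with first differences of the listed variables at the stated lags, and that robust already clusters on the panel identifier.

                          ```stata
                          * Hypothetical xtabond2 counterpart of the first xtdpdgmm command in #11.
                          * Assumption: gmm(..., equation(level)) uses first differences as instruments.
                          xtabond2 ROA l.ROA ARD ARDsq TANG CR SALES LEV GR y* industry*, gmm(l.ROA, laglimits(1 2) equation(diff) collapse) gmm(ARD ARDsq TANG CR SALES LEV GR, laglimits(2 3) equation(diff) collapse) gmm(l.ROA, laglimits(0 0) equation(level) collapse) gmm(ARD ARDsq TANG CR SALES LEV GR, laglimits(1 1) equation(level) collapse) iv(y* industry*, equation(level)) twostep robust small
                          ```

                          Comparing the instrument lists and the instrument counts printed beneath the two outputs is the simplest way to check whether the two specifications really coincide; if they do, the coefficient estimates should agree as well.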
