Am I correct with the two-step GMM syntax?

Avaz Yusibov

Join Date: Sep 2021
Posts: 8

02 Nov 2021, 02:47

Hello dear Stata users,

I am doing research on working capital management and firm performance with two-step GMM, using the xtabond2 command in Stata. The syntax and results are posted below.

Question 1. May I change the lags of the instrumental variables when I run the regression with "inventory days" and "accounts payable" as independent variables, and with different dependent variables (e.g., TobinsQ), as long as I get appropriate results for the Sargan/Hansen tests?

Question 2. Can you please explain what eq(diff) and eq(level) do? I could not find clear answers on this.

Question 3. What is the lowest acceptable chi-squared value for the Hansen/Sargan test?

ARD is Accounts Receivable Days and ARDsq is the square of ARD.

xtabond2 ROA l.ROA ARD ARDsq TANG CR SALES LEV GR y* industry*, gmm(l.ROA, l(2 2)) iv(ARD ARDsq TANG CR SALES LEV GR, eq(diff)) iv(ARD ARDsq TANG CR SALES LEV GR y*, eq(level)) nodiffsargan twostep robust

Favoring space over speed. To switch, type or click on mata: mata set matafavor speed, perm.
year1 dropped due to collinearity
year6 dropped due to collinearity
industry6 dropped due to collinearity
industry8 dropped due to collinearity

Warning: Two-step estimated covariance matrix of moments is singular.
Using a generalized inverse to calculate optimal weighting matrix for two-step estimation.

Dynamic panel-data estimation, two-step system GMM
------------------------------------------------------------------------------
Group variable: id                              Number of obs      =      1638
Time variable : Year                            Number of groups   =       227
Number of instruments = 40                      Obs per group: min =         1
Wald chi2(26) =   5520.06                                      avg =      7.22
Prob > chi2   =     0.000                                      max =        10
------------------------------------------------------------------------------
             |              Corrected
         ROA |      Coef.   Std. Err.      z    P>|z|   [95% Conf. Interval]  Sig
-------------+-------------------------------------------------------------------
       L.ROA |      0.720       0.145   4.95    0.000      0.435       1.004  ***
         ARD |     -0.003       0.001  -3.49    0.000     -0.005      -0.001  ***
       ARDsq |      0.000       0.000   2.29    0.022      0.000       0.000  **
        TANG |     -0.184       0.105  -1.76    0.079     -0.389       0.021  *
          CR |     -0.014       0.013  -1.06    0.289     -0.039       0.012
       SALES |     -0.022       0.061  -0.36    0.717     -0.143       0.098
         LEV |     -0.208       0.106  -1.97    0.049     -0.415      -0.001  **
          GR |      0.218       0.039   5.54    0.000      0.141       0.296  ***
       year2 |      0.037       0.035   1.05    0.293     -0.032       0.106
       year3 |      0.045       0.033   1.36    0.173     -0.020       0.109
       year4 |      0.004       0.018   0.21    0.831     -0.031       0.039
       year5 |      0.055       0.018   3.04    0.002      0.019       0.090  ***
       year7 |      0.031       0.021   1.52    0.130     -0.009       0.072
       year8 |      0.028       0.023   1.21    0.225     -0.017       0.073
       year9 |      0.067       0.024   2.76    0.006      0.019       0.114  ***
      year10 |     -0.023       0.030  -0.76    0.448     -0.081       0.036
      year11 |     -0.035       0.031  -1.16    0.248     -0.095       0.025
   industry1 |     -0.096       2.379  -0.04    0.968     -4.760       4.568
   industry2 |     -1.574       1.544  -1.02    0.308     -4.600       1.452
   industry3 |     -0.144       0.814  -0.18    0.859     -1.740       1.452
   industry4 |     -0.538       0.843  -0.64    0.524     -2.190       1.115
   industry5 |     -0.689       0.774  -0.89    0.373     -2.205       0.827
   industry7 |     -1.025       0.675  -1.52    0.129     -2.348       0.297
   industry9 |      0.225       0.234   0.96    0.336     -0.234       0.683
  industry10 |      0.437       0.664   0.66    0.510     -0.864       1.738
  industry11 |     -0.112       0.338  -0.33    0.739     -0.774       0.549
    Constant |      0.750       0.516   1.45    0.147     -0.262       1.762

Mean dependent var = 0.954        SD dependent var = 0.617
Number of obs      = 1638.000     Chi-square       = 5520.064

------------------------------------------------------------------------------
Instruments for first differences equation
Standard
D.(ARD ARDsq TANG CR SALES LEV GR)
GMM-type (missing=0, separate instruments for each period unless collapsed)
L2.L.ROA
Instruments for levels equation
Standard
ARD ARDsq TANG CR SALES LEV GR year1 year2 year3 year4 year5 year6 year7
year8 year9 year10 year11
_cons
GMM-type (missing=0, separate instruments for each period unless collapsed)
DL.L.ROA
------------------------------------------------------------------------------
Arellano-Bond test for AR(1) in first differences: z = -3.96 Pr > z = 0.000
Arellano-Bond test for AR(2) in first differences: z = 0.43 Pr > z = 0.668
------------------------------------------------------------------------------
Sargan test of overid. restrictions: chi2(13) = 31.50 Prob > chi2 = 0.003
(Not robust, but not weakened by many instruments.)
Hansen test of overid. restrictions: chi2(13) = 11.11 Prob > chi2 = 0.602
(Robust, but weakened by many instruments.)

Last edited by Avaz Yusibov; 02 Nov 2021, 02:56.

Tags: gmm, panel data, regression

Sebastian Kripfganz

Join Date: May 2014

Posts: 2624
#2

02 Nov 2021, 08:10

I recommend following a structured approach rather than arbitrarily varying lags in the quest for the "best results". It is usually a good idea not to restrict the instruments for the first-differenced model to a single lag but instead to use at least a few more lags; say, lag(2 5) for example.

eq(diff) specifies instruments for the first-differenced model, eq(level) specifies instruments for the untransformed model in levels.

Not sure what you mean by this. You would want to not reject the null hypothesis of the Sargan-Hansen test, thus you would like to have a p-value larger than your chosen significance level. To be on the safe side, you may want to set a higher significance level than you would normally do.
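As a rough illustration of the point about p-values: the Prob > chi2 printed next to each test is the upper-tail probability of a chi-squared distribution evaluated at the test statistic, so the figures in the first post can be checked with Stata's chi2tail() function. The 10% level used below is only an example threshold, not a recommendation made in this thread.

```stata
* Upper-tail chi-squared probabilities for the tests reported in the first post.
* chi2tail(df, x) returns Pr(chi-squared with df degrees of freedom > x).
display "Sargan p-value: " %5.3f chi2tail(13, 31.50)
display "Hansen p-value: " %5.3f chi2tail(13, 11.11)

* Compare the Hansen p-value with a chosen significance level
* (0.10 is an arbitrary example; 1 means "reject H0", 0 means "do not reject").
local alpha = 0.10
local reject = chi2tail(13, 11.11) < `alpha'
display "Reject Hansen H0 at alpha = `alpha'? " `reject'
```

The two displayed p-values should agree, up to rounding, with the 0.003 and 0.602 shown in the xtabond2 output.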

As an additional observation: You implicitly assumed that all variables in iv(ARD ARDsq TANG CR SALES LEV GR y*, eq(level)) are uncorrelated with the unobserved "fixed effects". This may or may not be a reasonable assumption. Often, you would want to use first differences of (some of) those variables as instruments for the level model.

The following presentation slides might be helpful:
Kripfganz, S. (2019). Generalized method of moments estimation of linear dynamic panel data models. Proceedings of the 2019 London Stata Conference.

https://www.kripfganz.de/stata/
Avaz Yusibov

Join Date: Sep 2021

Posts: 8
#3

02 Nov 2021, 23:04

Thank you very much for replying. How would you change the command? Could you please rewrite the model specification? I do not fully understand the level and difference equations here.

This is my syntax. Could you please rewrite it as you would specify it? An example is easier for me to understand.

xtabond2 ROA l.ROA ARD ARDsq TANG CR SALES LEV GR y* industry*, gmm(l.ROA, l(2 2)) iv(ARD ARDsq TANG CR SALES LEV GR, eq(diff)) iv(ARD ARDsq TANG CR SALES LEV GR y*, eq(level)) nodiffsargan twostep robust
Sebastian Kripfganz

Join Date: May 2014

Posts: 2624
#4

03 Nov 2021, 04:40

Before you proceed, I recommend that you get a better understanding of the estimators by reading some background literature about the difference and system GMM estimator for dynamic panel data models. My presentation slides above contain several references, for example Roodman's 2009 Stata Journal article.

In a first step, you then need to decide whether you want to classify your variables as endogenous, predetermined, or strictly exogenous with respect to the idiosyncratic error component. In addition, besides year and industry dummies, it is usually advisable to treat all variables as potentially correlated with the time-invariant unit-specific error component. Please see again my presentation slides and the references therein. Assuming that all variables (besides year and industry dummies) are endogenous, you could set up the following system GMM implementation:

Code:

xtdpdgmm ROA L.ROA ARD ARDsq TANG CR SALES LEV GR y* industry*, gmm(L.ROA, lag(1 4) model(diff)) gmm(ARD ARDsq TANG CR SALES LEV GR, lag(2 5) model(diff)) gmm(L.ROA, lag(0 0) diff model(level)) gmm(ARD ARDsq TANG CR SALES LEV GR, lag(1 1) diff model(level)) iv(y* industry*, model(level)) collapse twostep vce(robust) small

For details about the syntax and options, please consult again my presentation slides and the command help file.
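To make the classification step more concrete, here is a minimal sketch of how each class of regressor maps to a lag range, following the pattern of the command above. It runs on Stata's example panel (webuse abdata) rather than on the data from this thread, and treating w as endogenous and k as predetermined is an assumption made purely for illustration. It also assumes that xtdpdgmm is already installed and that the lines are run from a do-file (the /// line continuation does not work interactively).

```stata
* Illustrative data set, NOT the poster's data: n, w, k are firm-level series.
webuse abdata, clear
xtset id year

* Assumed classification (for illustration only):
*   L.n : lagged dependent variable (predetermined)
*   w   : endogenous
*   k   : predetermined
xtdpdgmm n L.n w k, ///
    gmm(L.n, lag(1 4) model(diff))       /// lagged dep. var.: lags start at 1
    gmm(w,   lag(2 5) model(diff))       /// endogenous: lags start at 2
    gmm(k,   lag(1 4) model(diff))       /// predetermined: lags start at 1
    gmm(L.n, lag(0 0) diff model(level)) /// current difference for levels
    gmm(w,   lag(1 1) diff model(level)) /// endogenous: lagged difference
    gmm(k,   lag(0 0) diff model(level)) /// predetermined: current difference
    collapse twostep vce(robust) small

* Overidentification test for the chosen instrument set.
estat overid
```

The only thing that changes between the classes is the first lag admitted as an instrument; the upper lag limits are a separate choice that controls the instrument count.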

https://www.kripfganz.de/stata/

Avaz Yusibov

Join Date: Sep 2021
Posts: 8

04 Nov 2021, 22:36

Hello Dr Sebastian.

Thank you very much again for your help. I have a question about the chi-squared statistic of the Hansen/Sargan tests: what minimum and maximum values should it take? I have read your presentation slides and Roodman (2009). The problem is that I have watched several videos on YouTube and read many comments on Statalist, and no two of them use the same method, which confuses me a lot.

My understanding is that xtdpdgmm is used for non-linear models, but some papers on WCM and firm performance use the xtabond2 command. For example, one paper describes it this way: "All specifications of Eq. (10) are estimated with the GMM estimator system, using the Stata command xtabond2 [47]. In particular, we consider the right-side variables as endogenous variables and use their lags from t-2 to t-3 as instruments for the equations in differences, and the lagged first-differenced endogenous regressors as instruments for the level equations. In contrast, time dummies are considered to be exogenous variables." If possible, could you please write the syntax that matches this description?

Your code above gave me this result.

Code:

Generalized method of moments estimation

Fitting full model:
Step 1         f(b) =  .01483229
Step 2         f(b) =  .16030475

Group variable: id                           Number of obs         =      1638
Time variable: Year                          Number of groups      =       227

Moment conditions:     linear =      59      Obs per group:    min =         1
                    nonlinear =       0                        avg =  7.215859
                        total =      59                        max =        10

                                   (Std. Err. adjusted for 227 clusters in id)
------------------------------------------------------------------------------
             |              WC-Robust
         ROA |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         ROA |
         L1. |    .428002   .1914573     2.24   0.026     .0507324    .8052717
             |
         ARD |  -.0005593    .001533    -0.36   0.716      -.00358    .0024615
       ARDsq |  -1.34e-06   5.29e-06    -0.25   0.801    -.0000118    9.08e-06
        TANG |  -.2197456   .1871677    -1.17   0.242    -.5885625    .1490713
          CR |  -.0246327   .0291492    -0.85   0.399    -.0820716    .0328062
       SALES |   .1491611     .05595     2.67   0.008     .0389108    .2594114
         LEV |  -.4005525   .2633979    -1.52   0.130    -.9195824    .1184773
          GR |    .142973   .0890178     1.61   0.110     -.032438     .318384
       year1 |          0  (omitted)
       year2 |   .1039732   .0365318     2.85   0.005     .0319867    .1759596
       year3 |   .0881445   .0312371     2.82   0.005     .0265913    .1496977
       year4 |   .0605548   .0330754     1.83   0.068    -.0046208    .1257305
       year5 |   .0782326   .0303052     2.58   0.010     .0185156    .1379496
       year6 |   .0349172   .0306876     1.14   0.256    -.0255532    .0953876
       year7 |   .0291707   .0299506     0.97   0.331    -.0298475    .0881889
       year8 |   .0286389   .0288672     0.99   0.322    -.0282444    .0855223
       year9 |   .0630019   .0283416     2.22   0.027     .0071543    .1188494
      year10 |          0  (omitted)
      year11 |  -.0736073   .0243996    -3.02   0.003     -.121687   -.0255276
   industry1 |  -.1519642   .1773551    -0.86   0.392    -.5014452    .1975169
   industry2 |  -.2565257   .2120583    -1.21   0.228      -.67439    .1613386
   industry3 |  -.3745091   .2134513    -1.75   0.081    -.7951183    .0461002
   industry4 |  -.1496469    .184586    -0.81   0.418    -.5133765    .2140827
   industry5 |  -.3514499   .2016478    -1.74   0.083    -.7488002    .0459003
   industry6 |  -.1367069    .140865    -0.97   0.333    -.4142836    .1408698
   industry7 |  -.4943237   .3267777    -1.51   0.132    -1.138244     .149597
   industry8 |          0  (omitted)
   industry9 |          0  (omitted)
  industry10 |  -.3355815   .1898879    -1.77   0.079    -.7097587    .0385956
  industry11 |   .1271968   .1690844     0.75   0.453    -.2059868    .4603804
       _cons |  -.4184963   .4146979    -1.01   0.314    -1.235665    .3986726
------------------------------------------------------------------------------
Instruments corresponding to the linear moment conditions:
 1, model(diff):
   L1.L.ROA L2.L.ROA L3.L.ROA L4.L.ROA
 2, model(diff):
   L2.ARD L3.ARD L4.ARD L5.ARD L2.ARDsq L3.ARDsq L4.ARDsq L5.ARDsq L2.TANG
   L3.TANG L4.TANG L5.TANG L2.CR L3.CR L4.CR L5.CR L2.SALES L3.SALES L4.SALES
   L5.SALES L2.LEV L3.LEV L4.LEV L5.LEV L2.GR L3.GR L4.GR L5.GR
 3, model(level):
   D.L.ROA
 4, model(level):
   L1.D.ARD L1.D.ARDsq L1.D.TANG L1.D.CR L1.D.SALES L1.D.LEV L1.D.GR
 5, model(level):
   year2 year3 year4 year5 year6 year7 year8 year9 year10 industry2 industry3
   industry4 industry5 industry6 industry7 industry9 industry10 industry11
 6, model(level):
   _cons

. 

. estat overid

Sargan-Hansen test of the overidentifying restrictions
H0: overidentifying restrictions are valid

2-step moment functions, 2-step weighting matrix       chi2(32)    =   36.3892
                                                       Prob > chi2 =    0.2716

2-step moment functions, 3-step weighting matrix       chi2(32)    =   47.8251
                                                       Prob > chi2 =    0.0357

Last edited by Avaz Yusibov; 04 Nov 2021, 23:30.


Sebastian Kripfganz

Join Date: May 2014

Posts: 2624
#6

05 Nov 2021, 03:29

I believe all you need to do is change lag(2 5) into lag(2 3) in order to use "their lags from t-2 to t-3 as instruments". xtdpdgmm can do almost everything you can do with xtabond2 and a few more things.
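As a small worked check of what that change does to the instrument count, the arithmetic below rebuilds the 59 moment conditions from the instrument list in the previous post and then repeats the count for lag(2 3). The figure of 27 estimated coefficients is read off the same output (non-omitted coefficients including the constant); the lag(2 3) count assumes the same dummies are dropped for collinearity, which is not guaranteed.

```stata
* Instrument counts read off the xtdpdgmm output in the previous post
* (collapsed instruments: one column per variable and lag).
local iv_lagdep_diff = 4        // L1-L4 of L.ROA, model(diff)
local iv_endog_diff  = 7 * 4    // 7 regressors x lags 2-5, model(diff)
local iv_level       = 1 + 7    // D.L.ROA plus 7 lagged differences, model(level)
local iv_dummies     = 18 + 1   // year/industry dummies plus the constant
local total = `iv_lagdep_diff' + `iv_endog_diff' + `iv_level' + `iv_dummies'
local df    = `total' - 27      // 27 coefficients are estimated
display "moment conditions with lag(2 5): " `total'
display "overid degrees of freedom:       " `df'

* Same count with lag(2 3): only the second instrument set shrinks.
local total23 = `total' - 7 * 4 + 7 * 2
local df23    = `total23' - 27
display "moment conditions with lag(2 3): " `total23'
display "overid degrees of freedom:       " `df23'
```

The first two numbers reproduce the 59 moment conditions and the chi2(32) reported by estat overid; the lag(2 3) variant comes to 45 moment conditions and 18 degrees of freedom under the stated assumption.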

Regarding the p-value of the Hansen test, there are no established thresholds. The recent paper by Kiviet (2020, Econometrics and Statistics) might provide some insights.

https://www.kripfganz.de/stata/
Avaz Yusibov

Join Date: Sep 2021

Posts: 8
#7

06 Nov 2021, 05:53

Thank you very much Dr. Sebastian for your great help. I appreciate your warm help!
Avaz Yusibov

Join Date: Sep 2021

Posts: 8
#8

20 Nov 2021, 00:17

I have another question related to GMM. How can I set up the command to control for unobservable heterogeneity? Can I go ahead with two-step system GMM if the independent variables are heteroskedastic? Is the robust option a remedy for heterogeneity? How can I add cluster(id) to the command?

Last edited by Avaz Yusibov; 20 Nov 2021, 00:33.
Sebastian Kripfganz

Join Date: May 2014

Posts: 2624
#9

20 Nov 2021, 12:33

Heteroskedasticity and heterogeneity are very different concepts. I am not sure what you have in mind. Also, heteroskedastic independent variables are not a reason for concern. We are usually concerned about heteroskedasticity in the error term (which may be functionally related to the independent variables). Two-step robust standard errors account for heteroskedastic errors. Given that id is your panel identifier, vce(robust) is identical to vce(cluster id).
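The last point can be checked directly. The sketch below fits the same model twice, once with vce(robust) and once with vce(cluster id), and compares the two covariance matrices. It uses Stata's example panel (webuse abdata) and an illustrative instrument set rather than the specification from this thread, and it assumes xtdpdgmm is installed and that the lines are run from a do-file.

```stata
* Illustrative data set, NOT the poster's data; id is the panel identifier.
webuse abdata, clear
xtset id year

* Two-step estimation with robust standard errors.
quietly xtdpdgmm n L.n w k, ///
    gmm(L.n, lag(1 4) model(diff)) ///
    gmm(w k, lag(2 5) model(diff)) ///
    collapse twostep vce(robust)
matrix V_robust = e(V)

* Same model, clustering explicitly on the panel identifier.
quietly xtdpdgmm n L.n w k, ///
    gmm(L.n, lag(1 4) model(diff)) ///
    gmm(w k, lag(2 5) model(diff)) ///
    collapse twostep vce(cluster id)
matrix V_cluster = e(V)

* Largest relative difference between the two covariance matrices.
display "max relative difference: " mreldif(V_robust, V_cluster)
```

If the two options are equivalent as stated, the displayed difference should be zero or negligibly small.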

https://www.kripfganz.de/stata/
Avaz Yusibov

Join Date: Sep 2021

Posts: 8
#10

26 Nov 2021, 06:51

I meant unobservable heterogeneity above.

xtabond2 ROA l.ROA ARD ARDsq TANG CR SALES LEV GR y* industry*, gmm(l.ROA ARD ARDsq TANG CR SALES LEV GR, eq(diff) collapse l(2 3)) iv(l(2 2).(l.ROA ARD ARDsq TANG CR SALES LEV GR) y* industry*, eq(level)) nodiffsargan twostep robust small orthogonal

xtabond2 ROA l.ROA ARD ARDsq TANG CR SALES LEV GR y* industry*, gmm(l.ROA ARD ARDsq TANG CR SALES LEV GR, eq(diff) collapse l(2 5)) iv(l(2 2).(l.ROA ARD ARDsq TANG CR SALES LEV GR) y* industry*, eq(level)) nodiffsargan twostep robust small

Which of these commands is right? I have unbalanced data, and I understand that the orthogonal option (orthogonal deviations) is suited to unbalanced data. I have treated all variables as endogenous, in line with a previous study that describes its approach as follows: "All specifications of Eq. (10) are estimated with the GMM estimator system [3], using the Stata command xtabond2 [47]. In particular, we consider the right-side variables as endogenous variables and use their lags from t-2 to t-3 as instruments for the equations in differences, and the lagged first-differenced endogenous regressors as instruments for the level equations. In contrast, time dummies are considered to be exogenous"

I posted this once above and followed the same approach, but could not get a sensible result. Can you please help me with that? Are the above commands acceptable if the other results (J statistic, number of groups larger than the number of instruments, and so on) are fine? Also, can you please add the cluster id option? I could not get it to work in Stata.

Thanks in advance

Last edited by Avaz Yusibov; 26 Nov 2021, 06:53.
Sebastian Kripfganz

Join Date: May 2014

Posts: 2624
#11

29 Nov 2021, 02:50

Neither of those commands will be right. Aside from the time dummies, the instruments specified with option iv(l(2 2).(l.ROA ARD ARDsq TANG CR SALES LEV GR) y* industry*, eq(level)) are invalid if there is unobserved heterogeneity. The iv() option does not automatically create first differences of the instruments for the level model. I recommend using the gmm() option with suboption collapse instead.

In accordance with the quoted statement, you may try the following:

Code:

xtdpdgmm ROA l.ROA ARD ARDsq TANG CR SALES LEV GR y* industry*, gmm(l.ROA, m(diff) collapse l(1 2)) gmm(ARD ARDsq TANG CR SALES LEV GR, m(diff) collapse l(2 3)) gmm(l.ROA, m(level) collapse l(0 0)) gmm(ARD ARDsq TANG CR SALES LEV GR, m(level) collapse l(1 1)) iv(y* industry*, m(level)) twostep vce(cluster id) small

If you want to use orthogonal deviations, the command differs slightly:

Code:

xtdpdgmm ROA l.ROA ARD ARDsq TANG CR SALES LEV GR y* industry*, gmm(l.ROA, m(fod) collapse l(0 1)) gmm(ARD ARDsq TANG CR SALES LEV GR, m(fod) collapse l(1 2)) gmm(l.ROA, m(level) collapse l(0 0)) gmm(ARD ARDsq TANG CR SALES LEV GR, m(level) collapse l(1 1)) iv(y* industry*, m(level)) twostep vce(cluster id) small

(Note that the lag specifications are different when using orthogonal deviations with xtdpdgmm compared to xtabond2.)

https://www.kripfganz.de/stata/
Avaz Yusibov

Join Date: Sep 2021

Posts: 8
#12

30 Nov 2021, 22:21

I tried it, but it did not give a good result. If possible, can you please write the syntax for xtabond2? Some papers have used it rather than xtdpdgmm, but they do not show how they wrote the command.
Sebastian Kripfganz

Join Date: May 2014

Posts: 2624
#13

01 Dec 2021, 01:49

If you choose the same instruments in xtabond2, you will receive the same results as with xtdpdgmm. Just switching between these commands does not give you "better" results. What do you mean by your statement that the results are "not good"?

https://www.kripfganz.de/stata/
