Help understand GMM method of xtdpdgmm

Bil Sudar

Join Date: Jul 2024

Posts: 4
#1


25 Jul 2024, 08:31

Hi,
I am currently testing for endogeneity in my regression of COE on lagged ESG (ESG at t-1). I have tried running both xtabond2 and xtdpdgmm. The estimation only works when I specify the command as shown below.
1. Can someone explain whether my code is correct and makes sense? Also, which papers can I read to understand this better?
2. My AR(2) test is significant, and only the AR(3) test is insignificant. Why is that?
. xtdpdgmm coe18 L.coe18 L.ESGScore L.log_TA2_w L.btm2_w L.lev_ta_w L.ROA_w L.capex_ta_w L.div_ta2_w, gmm(L.coe18 L.ESGScore, lag(2 4)) twostep vce(r)

Generalized method of moments estimation

Fitting full model:
Step 1 f(b) = .00012871
Step 2 f(b) = .82908424

Group variable: firm_id                      Number of obs         =       352
Time variable: year                          Number of groups      =        46

Moment conditions:     linear =      58      Obs per group:    min =         1
                    nonlinear =       0                        avg =  7.652174
                        total =      58                        max =        12

                               (Std. err. adjusted for 46 clusters in firm_id)
------------------------------------------------------------------------------
             |              WC-Robust
       coe18 | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
       coe18 |
         L1. |   .9075901   .0510667    17.77   0.000     .8075012    1.007679
             |
    ESGScore |
         L1. |  -.0001207   .0000486    -2.49   0.013    -.0002159   -.0000255
             |
   log_TA2_w |
         L1. |   .0002031   .0009916     0.20   0.838    -.0017403    .0021465
             |
      btm2_w |
         L1. |   .0071664   .0032929     2.18   0.030     .0007125    .0136203
             |
    lev_ta_w |
         L1. |  -.0124381   .0098277    -1.27   0.206    -.0316999    .0068238
             |
       ROA_w |
         L1. |   .0245419   .0228735     1.07   0.283    -.0202894    .0693733
             |
  capex_ta_w |
         L1. |   .1038046   .0267035     3.89   0.000     .0514667    .1561426
             |
   div_ta2_w |
         L1. |   .0749887   .1024789     0.73   0.464    -.1258663    .2758437
             |
       _cons |  -.0069952   .0231876    -0.30   0.763     -.052442    .0384516
------------------------------------------------------------------------------
Instruments corresponding to the linear moment conditions:
1, model(level):
2012:L2.L.coe18 2013:L2.L.coe18 2014:L2.L.coe18 2015:L2.L.coe18
2016:L2.L.coe18 2017:L2.L.coe18 2018:L2.L.coe18 2019:L2.L.coe18
2020:L2.L.coe18 2021:L2.L.coe18 2013:L3.L.coe18 2014:L3.L.coe18
2015:L3.L.coe18 2016:L3.L.coe18 2017:L3.L.coe18 2018:L3.L.coe18
2019:L3.L.coe18 2020:L3.L.coe18 2021:L3.L.coe18 2014:L4.L.coe18
2015:L4.L.coe18 2016:L4.L.coe18 2017:L4.L.coe18 2018:L4.L.coe18
2019:L4.L.coe18 2020:L4.L.coe18 2021:L4.L.coe18 2011:L2.L.ESGScore
2012:L2.L.ESGScore 2013:L2.L.ESGScore 2014:L2.L.ESGScore 2015:L2.L.ESGScore
2016:L2.L.ESGScore 2017:L2.L.ESGScore 2018:L2.L.ESGScore 2019:L2.L.ESGScore
2020:L2.L.ESGScore 2021:L2.L.ESGScore 2012:L3.L.ESGScore 2013:L3.L.ESGScore
2014:L3.L.ESGScore 2015:L3.L.ESGScore 2016:L3.L.ESGScore 2017:L3.L.ESGScore
2018:L3.L.ESGScore 2019:L3.L.ESGScore 2020:L3.L.ESGScore 2021:L3.L.ESGScore
2013:L4.L.ESGScore 2014:L4.L.ESGScore 2015:L4.L.ESGScore 2016:L4.L.ESGScore
2017:L4.L.ESGScore 2018:L4.L.ESGScore 2019:L4.L.ESGScore 2020:L4.L.ESGScore
2021:L4.L.ESGScore
2, model(level):
_cons

. estat overid

Sargan-Hansen test of the overidentifying restrictions
H0: overidentifying restrictions are valid

2-step moment functions, 2-step weighting matrix chi2(49) = 38.1379
Prob > chi2 = 0.8691

2-step moment functions, 3-step weighting matrix chi2(49) = 42.3927
Prob > chi2 = 0.7363

. estat serial, ar(1,2,3)

Arellano-Bond test for autocorrelation of the first-differenced residuals
H0: no autocorrelation of order 1 z = -2.6019 Prob > |z| = 0.0093
H0: no autocorrelation of order 2 z = -3.9205 Prob > |z| = 0.0001
H0: no autocorrelation of order 3 z = 0.9087 Prob > |z| = 0.3635

Thank you to anyone who can help.
Tags: gmm, panel, panel data, xtabond2, xtdpdgmm
Sebastian Kripfganz

Join Date: May 2014

Posts: 2581
#2

25 Jul 2024, 08:47

Note that xtdpdgmm by default specifies all instruments for the model in levels. This is most likely not what you want to do. For the "difference GMM" estimator, specify option model(diff).
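
A minimal sketch of that change, reusing the command and variable names from post #1 and altering only the model option. The lag range is kept as posted purely for illustration and is not a recommendation; the sketch also assumes the panel has already been declared with xtset.

```stata
* Sketch only: run from a do-file, because /// continues a command across lines.
* Assumes the panel is already declared, e.g.: xtset firm_id year
* model(diff) makes the gmm() instruments refer to the first-differenced model
* instead of the default model in levels.
xtdpdgmm coe18 L.coe18 L.ESGScore L.log_TA2_w L.btm2_w L.lev_ta_w ///
    L.ROA_w L.capex_ta_w L.div_ta2_w, model(diff)                 ///
    gmm(L.coe18 L.ESGScore, lag(2 4)) twostep vce(robust)

* Re-run the specification tests on the new estimates
estat overid
estat serial, ar(1 2 3)
```

The list of instruments printed below the coefficient table should then be headed model(diff) rather than model(level), which is a quick way to confirm that the option took effect.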

The significant AR(2) test indicates that the dynamics of your model might be misspecified. It strikes me here that you have specified all regressors with their first lag. Unless there are strong theoretical reasons why all those variables only affect the outcome variable with a delay of 1 time period, this is probably not a good idea. Some people lag the regressors to circumvent endogeneity issues, but again this is generally a bad idea. You should specify the regressors as you think they affect the outcome variable. If you expect an immediate effect in the same period, do not lag your regressors. If you suspect endogeneity, adjust your instruments accordingly (but not the regressors).
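
As a hedged sketch of that advice, suppose ESGScore is expected to affect coe18 in the same period and is suspected to be endogenous, while the control variables are treated as strictly exogenous. These classifications are assumptions made purely for illustration and would need to be justified for the actual data. The regressors then enter without lags, and the endogeneity is handled through the lag range of the instruments:

```stata
* Illustrative only: which variables are endogenous or exogenous is ASSUMED here.
* Run from a do-file, because /// continues a command across lines.
* In the first-differenced model:
*   - dependent variable and endogenous regressors: instruments start at lag 2
*   - strictly exogenous regressors: instruments can start at lag 0
* collapse is added in this sketch only to keep the instrument count small
* relative to the 46 groups; it is not part of the original command.
xtdpdgmm coe18 L.coe18 ESGScore log_TA2_w btm2_w lev_ta_w ROA_w   ///
    capex_ta_w div_ta2_w, model(diff) collapse                    ///
    gmm(coe18 ESGScore, lag(2 4))                                 ///
    gmm(log_TA2_w btm2_w lev_ta_w ROA_w capex_ta_w div_ta2_w, lag(0 2)) ///
    twostep vce(robust)

estat overid
estat serial, ar(1 2 3)
```

If a regressor were instead considered predetermined, the corresponding gmm() lag range would start at lag 1 rather than at lag 0 or lag 2.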

If you have not seen it yet, the following presentation slides might be helpful:
Kripfganz, S. (2019). Generalized method of moments estimation of linear dynamic panel data models. Proceedings of the 2019 London Stata Conference.

Also, there is a YouTube recording of a presentation I once gave online: https://www.youtube.com/live/5EnPiBMUYE4

https://www.kripfganz.de/stata/
Bil Sudar

Join Date: Jul 2024

Posts: 4
#3

25 Jul 2024, 09:41

Hi,

Thank you for your answer. My original model lags all regressors to capture their delayed impact on COE. I also ran the difference GMM estimator, but the AR(2) test remains significant. I am not sure how to handle this issue. Thank you.
Sebastian Kripfganz

Join Date: May 2014

Posts: 2581
#4

25 Jul 2024, 10:39

One approach to dealing with first-order serial correlation in the level errors (as indicated by a significant AR(2) test on the first-differenced residuals) is to add higher-order lags of the dependent variable (and/or the regressors) to the model. You could try adding L2.coe18 as a regressor and check whether the test still detects serial correlation.
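
A minimal sketch of this first suggestion, again reusing the variable names from post #1. The use of model(diff) follows the advice in post #2 and, like the unchanged lag range, is an assumption of this sketch rather than a tested specification:

```stata
* Sketch only: run from a do-file, because /// continues a command across lines.
* L2.coe18 is added as an extra regressor to capture richer dynamics.
xtdpdgmm coe18 L.coe18 L2.coe18 L.ESGScore L.log_TA2_w L.btm2_w   ///
    L.lev_ta_w L.ROA_w L.capex_ta_w L.div_ta2_w, model(diff)      ///
    gmm(L.coe18 L.ESGScore, lag(2 4)) twostep vce(robust)

* Check whether the AR(2) test still rejects after the change
estat serial, ar(1 2 3)
```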

Another alternative, given that there is no evidence of higher-order serial correlation (since AR(3) does not reject), would be to start with higher-order lags of your instruments; i.e. lag(3 4) instead of lag(2 4).
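
The second suggestion could be sketched as follows. Only the lag range inside gmm() differs from the command in post #1, apart from model(diff), which is assumed here following post #2:

```stata
* Sketch only: run from a do-file, because /// continues a command across lines.
* The instruments now start one lag later: lag(3 4) instead of lag(2 4).
xtdpdgmm coe18 L.coe18 L.ESGScore L.log_TA2_w L.btm2_w L.lev_ta_w ///
    L.ROA_w L.capex_ta_w L.div_ta2_w, model(diff)                 ///
    gmm(L.coe18 L.ESGScore, lag(3 4)) twostep vce(robust)

* Overidentification and serial-correlation tests for the new instrument set
estat overid
estat serial, ar(1 2 3)
```

Dropping the closest lag reduces the number of moment conditions, so the degrees of freedom reported by estat overid will be smaller than the chi2(49) shown in post #1.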
