  • #16
    Originally posted by Sebastian Kripfganz View Post
    Unbalanced panel data should not be a problem. If Stata takes a long time to compute the results with the large data set, it is just because of the total number of observations, not because the data set is unbalanced.

    With your reduced data set, there are only 3 consecutive time periods used in the estimation. Note that you effectively lose 2 time periods because of the lags of the dependent variable. To compute an AR(2) test statistic, you need more than those 3 effective time periods.

    Dear Dr. Kripfganz

    Many thanks indeed for your valuable comment and suggestion.

    As you mentioned, “Note that you effectively lose 2 time periods because of the lags of the dependent variable. To compute an AR(2) test statistic, you need more than those 3 effective time periods”.
    I took your advice into consideration. In the balanced panel (smaller number of observations), I filtered the data so that each company has 6 consecutive years (lustrums), which works well given that my model uses the second lag of the dependent variable. I have several scenarios with different lags of the independent variables. I implemented controls for year (dummies called “dano”) and company size (Micro, Small, Medium). For simplicity of analysis, I exported the results to a Word table to compare the scenarios (the table includes the Hansen test, Sargan test, AR(1), and AR(2)).

    Dr. Kripfganz, could you kindly advise on how to decide (is there a protocol?) which scenario(s) are best?

    Thank you very much again,

    Kind regards,

    Paul.
    Attached Files



    • #17
      You can use the Hansen test and difference-in-Hansen tests, as well as the Andrews-Lu model and moment selection criteria, to discriminate between models. Please have a look at the section on "Model selection" in my presentation at the 2019 London Stata Conference and the paper by Kiviet (2020) that is referenced in the presentation:
      https://www.kripfganz.de/stata/



      • #18
        Dear Dr. Kripfganz

        Thank you again for your valued comments.

        I tried to follow the recommendation on model selection from your 2019 London Stata Conference presentation (slide 90).
        However, I ran into a discrepancy.
        Below are the models implemented with xtdpdgmm and xtabond2.

        First, I tried another angle and used the command xtdpdgmm, as suggested in the presentation.

        set more off
        xtdpdgmm Wroa1 l.Wroa1 L2.Wroa1 Winvperiod1 l.Wpayperiod1 Warperiod1 l.Wcurrasstotasset1 Micro Small Medium, ///
        twostep iv(GDPgrowth) gmm(l.Winvperiod1 l.Wpayperiod1 l.Warperiod1, lag(1 2)) vce(robust) teffects
        estat overid
        estat serial, ar(1/3)
        estimate store xtdpdgmmctrYS1
        estat mmsc xtdpdgmmctrYS1

        Second, so far I have been working with xtabond2, and my understanding is that xtabond2 and xtdpdgmm should produce the
        same results with equivalent coding. I tried to replicate the equations, but the results were not the same.
        On slide 6 “Equivalent system-GMM implementations in Stata” I noticed the following structural command syntax:

        xtabond2 L(0/1).n w k, gmm(L.n w k, lag(1 3)) h(2) two
        xtdpdgmm L(0/1).n w k, gmm(L.n w k, l(1 3) m(d)) gmm(L.n w k, d l(0 0)) w(ind) two

        Maybe the models did not yield the same results because the highlighted terms are not familiar to me, or because I needed to specify additional options in the code.
        I would appreciate your kind help with this.

        set more off
        global danolist dano*
        xtabond2 Wroa1 l.Wroa1 L2.Wroa1 Winvperiod1 l.Wpayperiod1 Warperiod1 l.Wcurrasstotasset1 Micro Small Medium $danolist, ///
        twostep ivstyle(GDPgrowth) gmmstyle(l.Winvperiod1 l.Wpayperiod1 l.Warperiod1, lag(1 2)) robust small orthogonal
        estimate store GMMctrYS1

        On the xtabond2 estimations: to obtain the Akaike (AIC) and Bayesian (BIC) information criteria, I used estat ic, but Stata replied: “likelihood information not found in last estimation results”. So I calculated them manually using:
        AIC = n*ln(SSR) + 2*k
        BIC = n*ln(SSR) + k*ln(n)
        where:
        n is the sample size and k is the number of estimated parameters.

        SSR was estimated using:
        predict e
        gen e2 = e^2
        total e2

        The model with the lowest AIC/BIC is selected.
        (I don’t know whether this calculation is correct.)
        NOTE: All GMM estimations include time and industry dummies.
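        The manual calculation above can be sketched as follows (a sketch in Python rather than Stata, assuming the Gaussian-likelihood form of the criteria; note that n*ln(SSR) and the more common n*ln(SSR/n) differ only by the constant n*ln(n), so with the same n across models they rank models identically):

```python
import math

def aic_bic(ssr, n, k):
    """Manual AIC/BIC from the sum of squared residuals (Gaussian form).

    ssr : sum of squared residuals (e.g. from `total e2` in Stata)
    n   : number of observations used in the estimation
    k   : number of estimated parameters
    """
    aic = n * math.log(ssr / n) + 2 * k
    bic = n * math.log(ssr / n) + k * math.log(n)
    return aic, bic

# Hypothetical comparison of two scenarios estimated on the same sample;
# the numbers are made up purely for illustration.
aic1, bic1 = aic_bic(ssr=120.0, n=500, k=10)
aic2, bic2 = aic_bic(ssr=118.0, n=500, k=14)
# BIC penalizes the 4 extra parameters more heavily than AIC does,
# so the two criteria can disagree on which scenario is preferred.
```

        As the comment notes, both criteria trade fit (SSR) against parsimony (k); they only make sense for comparing models estimated on the same sample.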

        Dr. Kripfganz, could you kindly advise on how to resolve this conundrum?

        Kind regards,

        Paul



        • #19
          Originally posted by Paul Jameson View Post
          xtdpdgmm Wroa1 l.Wroa1 L2.Wroa1 Winvperiod1 l.Wpayperiod1 Warperiod1 l.Wcurrasstotasset1 Micro Small Medium, ///
          twostep iv(GDPgrowth) gmm(l.Winvperiod1 l.Wpayperiod1 l.Warperiod1, lag(1 2)) vce(robust) teffects
          You did not specify the (sub-)option model(). By default, all instruments refer to the level model. It is unlikely that this is what you intend to do.

          Originally posted by Paul Jameson View Post
          global danolist dano*
          xtabond2 Wroa1 l.Wroa1 L2.Wroa1 Winvperiod1 l.Wpayperiod1 Warperiod1 l.Wcurrasstotasset1 Micro Small Medium $danolist, ///
          twostep ivstyle(GDPgrowth) gmmstyle(l.Winvperiod1 l.Wpayperiod1 l.Warperiod1, lag(1 2)) robust small orthogonal
          The ivstyle() option without the suboption equation() does not produce separate instruments for the transformed model and the level model. Unless you know what that option is doing and you indeed intend to do this, do not use the ivstyle() option without the suboption equation(). The ivstyle() option without the equation() suboption cannot be replicated with xtdpdgmm. As a general recommendation, always specify the suboption equation() (or model() for xtdpdgmm) explicitly to make sure that you are really specifying the desired model.
          Also note that the orthogonal option produces (in most cases) incorrect estimates due to a bug in xtabond2.

          Originally posted by Paul Jameson View Post
          On the xtabond2 estimations: to obtain the Akaike (AIC) and Bayesian (BIC) information criteria, I used estat ic, but Stata replied: “likelihood information not found in last estimation results”. So I calculated them manually using:
          AIC = n*ln(SSR) + 2*k
          BIC = n*ln(SSR) + k*ln(n)
          where:
          n is the sample size and k is the number of estimated parameters.

          SSR was estimated using:
          predict e
          gen e2 = e^2
          total e2

          The model with the lowest AIC/BIC is selected.
          (I don’t know whether this calculation is correct.)
          The MMSC-AIC and MMSC-BIC reported by estat mmsc after xtdpdgmm are not the conventional AIC / BIC. The conventional criteria just focus on the number of estimated coefficients. The MMSC take the number of moment conditions into account. xtabond2 does not calculate these criteria.
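          As a sketch of the difference (in Python for illustration; the formulas follow Andrews and Lu (2001), and the exact scaling used by xtdpdgmm's estat mmsc may differ):

```python
import math

def mmsc(j_stat, n_obs, n_moments, n_params):
    """Andrews-Lu model and moment selection criteria (sketch).

    j_stat    : Hansen J (overidentification) statistic
    n_obs     : number of observations
    n_moments : number of moment conditions (instruments)
    n_params  : number of estimated parameters
    """
    df = n_moments - n_params          # degrees of overidentification
    mmsc_aic = j_stat - 2 * df
    mmsc_bic = j_stat - df * math.log(n_obs)
    return mmsc_aic, mmsc_bic

# Hypothetical example: two specifications with the same coefficients but
# different instrument counts; lower criterion values are preferred.
a1, b1 = mmsc(j_stat=15.2, n_obs=800, n_moments=20, n_params=8)
a2, b2 = mmsc(j_stat=14.8, n_obs=800, n_moments=30, n_params=8)
```

          Unlike the conventional AIC/BIC, the penalty here depends on the number of overidentifying restrictions (moments minus parameters), not just on the number of coefficients, which is why the two kinds of criteria are not interchangeable.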
          https://www.kripfganz.de/stata/



          • #20
            Originally posted by Sebastian Kripfganz View Post
            The hope would be that the lagged dependent variables take care of this serial correlation. If the model passes the Arellano-Bond serial correlation test, you should be fine.

            If model(diff) lagrange(1 4) is valid for the first gmmiv() set, then model(level) diff lagrange(0 0) would be valid for the second gmmiv() set.
            If you suspect remaining serial error correlation, then you need to adjust both lag ranges, e.g. model(diff) lagrange(2 4) for the first set and model(level) diff lagrange(1 1) for the second set.
            Hi, Sebastian. I'm returning to this paper many months later, and I have a question (I hope this will catch your attention!).

            In the final version, I used
            model(diff) lagrange(2 2) for the first set of instruments and
            model(level) diff lagrange(1 1) for the second set of instruments.

            Then, in the Arellano-Bond test, I got

            Code:
            Arellano-Bond test for autocorrelation of the first-differenced residuals
            H0: no autocorrelation of order 1:     z =   -0.3523   Prob > |z|  =    0.8146
            H0: no autocorrelation of order 2:     z =   -3.1657   Prob > |z|  =    0.0014
            H0: no autocorrelation of order 3:     z =   -1.6235   Prob > |z|  =    0.1055
            My understanding is that these results mean I passed the Arellano-Bond test for serial correlation: because I shifted the instruments back one lag, I am now interested in whether there is autocorrelation at order 3, not at order 2. I have autocorrelation at order 2 but none at order 3, so I pass the test.

            Is that correct? Or do we always have to not reject the null for order 2 when using the A-B test?
            Thanks for any insights you might have!
            Last edited by Sandy Lovejoy; 04 Apr 2021, 00:18.



            • #21
              Based on a strict interpretation of the test results, your instruments might be valid. However, the AR(3) test result is not too comforting, and the non-rejection of the AR(1) test is an indication of an unusual serial-correlation pattern. It is usually a better approach to construct a model that passes the Arellano-Bond tests in the usual way (rejecting AR(1) but not rejecting any higher-order tests). If that is not possible for whatever reason, you might get away with your approach of using deeper lags, although these can become relatively weak instruments.
              https://www.kripfganz.de/stata/
