  • xtabond2 and deeper lags

    Hello. I'm new here. (Correct me if I'm missing something!)
    I have questions regarding the use of xtabond2.

    I ran a regression with xtabond2.
    I am using panel data with N = 7543 and T = 9.

    The problem I'm having is that, unless I use deeper lags as instruments, I cannot pass the Hansen test
    (the p-values are very close to 0). The syntax I'm using for the system GMM is

    xtabond2 y l.y x1 x2, gmm(y, lag(6 7)) gmm(x1 x2, lag(6 7)) iv(i.year) robust twostep

    However, even if I can pass the Hansen test, would these deeper lags constitute valid instruments?

    Thank you in advance!


  • #2
    If possible, please show us the output table with the estimation results (using CODE delimiters as explained in the FAQ #12.3).

    For the instruments, you would usually start with the second lag of the dependent variable and the first lag of the independent variables (or contemporaneous terms, depending on whether the variables are predetermined or strictly exogenous) instead of lag 6. Why did you start at lag 6? If your variables are not very persistent, the sixth and seventh lags may not be strong instruments because they are only weakly correlated with the instrumented variables.
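
    For illustration, such a conventional starting point might look like the following sketch (it reuses the placeholder names from #1 and assumes, purely for the sake of the example, that x1 and x2 are predetermined):

    Code:
     * illustrative sketch only: instruments from lag 2 onwards for the lagged dependent
     * variable, and from lag 1 onwards for the (assumed predetermined) regressors x1 and x2
     xtabond2 y l.y x1 x2 i.year, gmm(y, lag(2 .)) gmm(x1 x2, lag(1 .)) iv(i.year) twostep robust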
    https://twitter.com/Kripfganz



    • #3
      Dear Sebastian

      Thank you for your answer.

      I am running the regression below, in which total factor productivity (ltfp) is regressed on outsourcing intensity (routsales) and R&D intensity (rndva).
      I suspect that both regressors are endogenous, so I used

      Code:
       xtabond2 ltfp l.ltfp routsales rndva i.year, iv(i.year) gmm(routsales rndva, lag(2 3)) gmm(ltfp, lag(2 3)) twostep robust artests(3)
      However, if I use the second and third lags as instruments,
      I get the following:

      Code:
      
      Sargan test of overid. restrictions: chi2(64)   = 172.90  Prob > chi2 =  0.000
        (Not robust, but not weakened by many instruments.)
      Hansen test of overid. restrictions: chi2(64)   = 128.38  Prob > chi2 =  0.000
        (Robust, but weakened by many instruments.)
      
      Difference-in-Hansen tests of exogeneity of instrument subsets:
        GMM instruments for levels
          Hansen test excluding group:     chi2(39)   =  82.00  Prob > chi2 =  0.000
          Difference (null H = exogenous): chi2(25)   =  46.37  Prob > chi2 =  0.006
        gmm(routsales rndva, lag(2 3))
          Hansen test excluding group:     chi2(19)   =  62.34  Prob > chi2 =  0.000
          Difference (null H = exogenous): chi2(45)   =  66.04  Prob > chi2 =  0.022
        gmm(ltfp, lag(2 3))
          Hansen test excluding group:     chi2(40)   =  41.81  Prob > chi2 =  0.392
          Difference (null H = exogenous): chi2(24)   =  86.57  Prob > chi2 =  0.000
        iv(2006b.year 2007.year 2008.year 2009.year 2010.year 2011.year 2012.year 2013.year 2014.year
      >  2015.year)
          Hansen test excluding group:     chi2(56)   =  96.96  Prob > chi2 =  0.001
          Difference (null H = exogenous): chi2(8)    =  31.42  Prob > chi2 =  0.000
      which I think means that my instruments are not valid. Only when I use the sixth and seventh lags do I get the following:

      Code:
      Sargan test of overid. restrictions: chi2(28)   =  20.00  Prob > chi2 =  0.865
        (Not robust, but not weakened by many instruments.)
      Hansen test of overid. restrictions: chi2(28)   =  30.62  Prob > chi2 =  0.334
        (Robust, but weakened by many instruments.)
      
      Difference-in-Hansen tests of exogeneity of instrument subsets:
        GMM instruments for levels
          Hansen test excluding group:     chi2(14)   =   8.97  Prob > chi2 =  0.833
          Difference (null H = exogenous): chi2(14)   =  21.66  Prob > chi2 =  0.086
        gmm(routsales rndva, lag(6 7))
          Hansen test excluding group:     chi2(11)   =   9.14  Prob > chi2 =  0.609
          Difference (null H = exogenous): chi2(17)   =  21.48  Prob > chi2 =  0.205
        gmm(ltfp, lag(6 7))
          Hansen test excluding group:     chi2(12)   =  12.73  Prob > chi2 =  0.389
          Difference (null H = exogenous): chi2(16)   =  17.89  Prob > chi2 =  0.330
        iv(2006b.year 2007.year 2008.year 2009.year 2010.year 2011.year 2012.year 2013.year 2014.year
      >  2015.year)
          Hansen test excluding group:     chi2(20)   =  23.97  Prob > chi2 =  0.244
          Difference (null H = exogenous): chi2(8)    =   6.65  Prob > chi2 =  0.575
      As I'm new to this command and have very little experience in panel data analysis, any advice would be really helpful.
      Thank you in advance!



      • #4
        There might be remaining serial correlation of the error term. Is the Arellano-Bond AR(2) test rejecting the null hypothesis of no second-order serial correlation of the first-differenced error term? In that case, it is probably a better idea to directly include further lags of the dependent (and/or independent) variable(s) as regressors.

        In addition, I highly recommend using the suboption equation(level) for the time dummy instruments, that is, iv(i.year, eq(level)). See the following topic for a discussion of this matter: System GMM - Time dummies
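
        Applied to the command from #3, this would amount to something like the following sketch (only the iv() option changes):

        Code:
         * sketch: same specification as in #3, but the time dummies now instrument only the level equation
         xtabond2 ltfp l.ltfp routsales rndva i.year, iv(i.year, eq(level)) gmm(routsales rndva, lag(2 3)) gmm(ltfp, lag(2 3)) twostep robust artests(3)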

        Moreover, the degrees of freedom for the Hansen test might be incorrect if there are omitted variables (in particular, omitted categories of the time dummies). This is a bug in xtabond2, which can only be avoided if the time dummies are specified explicitly, without the factor notation. See my command xtseqreg with its option teffects as an alternative to xtabond2: XTSEQREG: new Stata command for sequential / two-stage (GMM) estimation of linear panel models
        Last edited by Sebastian Kripfganz; 12 Jul 2017, 07:39. Reason: last paragraph added
        https://twitter.com/Kripfganz



        • #5
          Thank you for your suggestion, especially for pointing me to xtseqreg (I'll have a look at it once I get the hang of xtabond2)!

          Following your advice, I ran a regression with the following syntax

          Code:
          xtabond2 ltfp l.ltfp routsales rndva yr1-yr10, iv(yr1-yr10, eq(level)) gmm(routsales rndva, lag(2 3)) gmm(ltfp, lag(2 3)) twostep robust artests(3)
          I'm also adding the results of the AR tests.

          Code:
          ------------------------------------------------------------------------------
          Arellano-Bond test for AR(1) in first differences: z = -42.15  Pr > z =  0.000
          Arellano-Bond test for AR(2) in first differences: z =   4.08  Pr > z =  0.000
          Arellano-Bond test for AR(3) in first differences: z =  -0.37  Pr > z =  0.715
          ------------------------------------------------------------------------------
          Sargan test of overid. restrictions: chi2(64)   = 171.66  Prob > chi2 =  0.000
            (Not robust, but not weakened by many instruments.)
          Hansen test of overid. restrictions: chi2(64)   = 115.93  Prob > chi2 =  0.000
            (Robust, but weakened by many instruments.)
          The null is rejected for AR(2). This may mean that I need to use deeper lags, such as the fourth or fifth, as instruments.
          However, as I mentioned above, including the fourth and fifth lags (as below) does not fix the problem.

          Code:
          xtabond2 ltfp l.ltfp routsales rndva yr1-yr10, iv(yr1-yr10, eq(level)) gmm(routsales rndva, lag(4 5)) gmm(ltfp, lag(4 5)) twostep robust artests(3)
          Code:
          ------------------------------------------------------------------------------
          Arellano-Bond test for AR(1) in first differences: z =  -7.80  Pr > z =  0.000
          Arellano-Bond test for AR(2) in first differences: z =   4.15  Pr > z =  0.000
          Arellano-Bond test for AR(3) in first differences: z =  -0.13  Pr > z =  0.898
          ------------------------------------------------------------------------------
          Sargan test of overid. restrictions: chi2(46)   =  62.42  Prob > chi2 =  0.054
            (Not robust, but not weakened by many instruments.)
          Hansen test of overid. restrictions: chi2(46)   =  68.16  Prob > chi2 =  0.019
            (Robust, but weakened by many instruments.)
          Only from the sixth lag onwards! :P
          However, the sixth and seventh lags would make poor instruments in terms of validity, wouldn't they?
          And since the time dummies are specified explicitly, there should be no problem with the calculation of the degrees of freedom that you noted above, correct?

          Thank you in advance!



          • #6
            What about adding further lags of the dependent variable, e.g.,
            Code:
            xtabond2 ltfp L.ltfp L2.ltfp routsales rndva yr2-yr10, iv(yr2-yr10, eq(level)) gmm(routsales rndva, lag(2 3)) gmm(ltfp, lag(2 3)) twostep robust artests(3)
            (Notice that this implies that you have to remove one of the time dummies.)

            If none of the coefficients in the regression output is labelled as "omitted" or "empty", then the degrees of freedom should be fine in your case.

            At a later stage, it might be worth using not just two lags (2 and 3) but a couple more, or even all of them, given that your T is really small relative to N, so that instrument proliferation is less of an issue.
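
            For example, a specification using all available lags from the second onwards might look like this sketch (the exact lag choice is just an illustration, to be checked against the specification tests):

            Code:
             * sketch: all available lags from lag 2 onwards as GMM-style instruments
             xtabond2 ltfp L.ltfp L2.ltfp routsales rndva yr2-yr10, iv(yr2-yr10, eq(level)) gmm(routsales rndva, lag(2 .)) gmm(ltfp, lag(2 .)) twostep robust artests(3)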
            https://twitter.com/Kripfganz



            • #7
              Thank you so much for your kind explanation!

              It has been very helpful! :D



              • #8
                Dear Sebastian

                Can I ask you one more question?
                I did what you suggested (including the second lag of the dependent variable), and I also thought that, with a large number of observations N, I could use all the deeper lags as instruments. Thus, I used

                Code:
                xtabond2 ltfp l.ltfp l2.ltfp routsales rndva yr2-yr10, iv(yr2-yr10, eq(level)) gmm(routsales rndva, lag(3 .)) gmm(ltfp, lag(3 .)) twostep robust artests(3)
                However, the result does not look good as it shows

                Code:
                ------------------------------------------------------------------------------
                Arellano-Bond test for AR(1) in first differences: z =  -8.78  Pr > z =  0.000
                Arellano-Bond test for AR(2) in first differences: z =   3.50  Pr > z =  0.000
                Arellano-Bond test for AR(3) in first differences: z =   0.79  Pr > z =  0.430
                ------------------------------------------------------------------------------
                Sargan test of overid. restrictions: chi2(99)   = 127.68  Prob > chi2 =  0.028
                  (Not robust, but not weakened by many instruments.)
                Hansen test of overid. restrictions: chi2(99)   = 156.03  Prob > chi2 =  0.000
                  (Robust, but weakened by many instruments.)
                According to the Hansen test, this may suggest that using all the lags as instruments is not a good idea, as they do not constitute valid instruments.
                Am I correct in this reasoning?

                Thank you in advance! :D



                • #9
                  It is indeed probably still a good idea to restrict the number of lags used as instruments. Some solution between two lags and all lags is probably most reasonable. The collapse suboption might also be a good idea; at the very least, it will not do much harm.
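
                  For instance, a middle-ground specification with collapsed instrument sets could look like the following sketch (the particular lag range, 3 to 6, is an illustrative assumption, to be judged against the test results):

                  Code:
                   * sketch: a lag range between "two lags" and "all lags", with collapsed instruments
                   xtabond2 ltfp L.ltfp L2.ltfp routsales rndva yr2-yr10, iv(yr2-yr10, eq(level)) gmm(routsales rndva, lag(3 6) collapse) gmm(ltfp, lag(3 6) collapse) twostep robust artests(3)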

                  Yet, this probably will not help much in getting rid of the serial error correlation. If it is possible for you, it might be good to think about adding further variables that could matter; such omitted variables can easily be the source of serial error correlation. The Hansen test is just a general test for model misspecification, and there can be many sources of such misspecification.
                  https://twitter.com/Kripfganz



                  • #10
                    Hello,

                    I'm working on a dynamic model (via xtabond2) of bank risk as a function of interest rates (with controls at the bank and country level). To check robustness, I compare models that use different rates. Is it a mistake to use different lags (for the interest-rate variable and/or the control variables) when changing the type of rate?
                    My goal is to improve the Arellano-Bond AR(1) and AR(2) tests as well as the Sargan and Hansen tests from one model to the next, while keeping the same dependent and control variables.

                    Thank you
