Dear Statalisters,

When working with panel data, testing for serial correlation is standard practice. For example, the Arellano and Bond (1991) test is regularly used as a specification test after estimating linear dynamic panel data models. Especially in the context of GMM estimation, evidence of serially correlated residuals often implies that some instrumental variables are invalid, and therefore that the estimator is inconsistent.

The Arellano and Bond (1991) test and the closely related Yamagata (2008) test check for serial correlation in the first-differenced residuals. Those tests can have low power, especially against alternatives with very high serial correlation (close to a random walk). Recently, Jochmans (2020) proposed a portmanteau test and showed that it is more powerful than existing tests, at least if T (the number of time periods) is very small. However, when T becomes larger, the number of covariance restrictions to be tested increases quadratically with T, which can lead to severe loss of power. This is akin to the too-many-instruments problem of GMM estimation. It is less of a problem when N (the number of groups) is very large, but can quickly bite (even for T as small as, say, 10) when N is comparatively small (as with typical macroeconomic data sets).
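
To put rough numbers on the quadratic growth: T time periods give T(T-1)/2 distinct pairs of periods, which is the order of magnitude of the covariances a portmanteau-type test has to consider. The exact degrees of freedom of a particular test can differ from this count, so the figures below are only an illustration of how quickly the number grows.

```latex
\binom{T}{2} = \frac{T(T-1)}{2}: \qquad
T = 5 \mapsto 10, \quad T = 10 \mapsto 45, \quad T = 20 \mapsto 190
```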

In current work, we (Kripfganz, Demetrescu, and Hosseinkouchack, 2024) propose alternative tests by borrowing the idea of "collapsing" and "curtailing" from the dynamic panel data GMM literature to reduce the number of covariance restrictions in the test. We find that a particularly powerful test is one that evaluates the covariance between "seasonal differences" (which we also call "sandwich differences") and first differences of the residuals. In combination with the collapsing approach, it does not suffer from the power loss of the portmanteau test. It also retains power against highly correlated alternatives, unlike the earlier tests based entirely on first differences.

Note that the Arellano and Bond (1991), Yamagata (2008), and our Kripfganz, Demetrescu, and Hosseinkouchack (2024) tests are special cases of the Jochmans (2020) approach: they consider linear combinations of the covariance restrictions instead of the whole set. All of these tests are now implemented in my new xtdpdserial package:

Code:

    net install xtdpdserial, from(http://www.kripfganz.de/stata/)

The command works as a postestimation command. It currently supports the estimation commands regress, xtreg (with option fe or re), and my own commands xtdpdgmm and xtdpdbc. This should cover a large range of possible applications.

It is difficult to support other commands, because many do not provide the score option with predict. Those scores are needed to properly compute the test statistic (unless all regressors are strictly exogenous and uncorrelated with the group-specific effects, which really only holds in the case of xtreg, re). If those scores are available, as with xtdpdgmm, the regressors are allowed to be strictly exogenous, predetermined, or endogenous. This is a big advantage over alternative tests, such as the popular Wooldridge-Drukker test implemented in xtserial, which rely on the assumption of strictly exogenous regressors (and therefore rule out dynamic models). Moreover, the new tests implemented in xtdpdserial are all robust to heteroskedasticity.

The documentation of the new tests (Kripfganz, Demetrescu, and Hosseinkouchack, 2024) is still in preparation. For now, the Remarks section in the xtdpdserial help file should give you the gist of the idea. Also, stay tuned for this year's London Stata Conference, where I will present this new command.

Here is a simple example for these tests after a fixed-effects regression:

Code:

    . webuse abdata

    . quietly xtreg n w k, fe vce(robust)

    *** Jochmans (2020) portmanteau test ***
    . xtdpdserial, pm

    portmanteau test                              chi2(35)    =    82.6457
    H0: no autocorrelation of any order           Prob > chi2 =     0.0000

    *** Arellano and Bond (1991) test for second-order autocorrelation in first differences ***
    . xtdpdserial, difference collapse lagrange(2 2)

    collapsed test in first differences           chi2(1)     =     1.2134
    H0: no autocorrelation up to order 3          Prob > chi2 =     0.2707

    *** Yamagata (2008) test ***
    . xtdpdserial, difference collapse

    collapsed test in first differences           chi2(6)     =     8.7452
    H0: no autocorrelation of any order           Prob > chi2 =     0.1884

    *** Kripfganz, Demetrescu, and Hosseinkouchack (2024) tests ***
    . xtdpdserial, sdifference collapse order(2)

    collapsed test in seasonal differences        chi2(1)     =    47.0566
    H0: no autocorrelation up to order 2          Prob > chi2 =     0.0000

    . xtdpdserial, sdifference collapse

    collapsed test in seasonal differences        chi2(6)     =    50.9795
    H0: no autocorrelation of any order           Prob > chi2 =     0.0000

Interestingly, in this example the Arellano and Bond (1991) and the Yamagata (2008) tests do not reject the null hypothesis, while the more powerful tests do. This could be a manifestation of the problem mentioned earlier, that tests based on first differences have low power to detect high autocorrelation. Note also that, while the Arellano and Bond (1991) test above is a test for no second-order serial correlation in the first-differenced residuals, this rules out serial correlation up to order 3 in levels under the null hypothesis. (This is what is stated in the test output above.)

For details on the syntax and options, please consult the help file. There are two syntaxes. Syntax 1, used above, returns a single test result. With Syntax 2, multiple tests can be obtained at once:

Code:

    . xtdpdserial, statistics(pm dc(2 2) dc sdc(2) sdc)

(The output is omitted because these are the same tests as those above.)

xtdpdserial can also be used as a standalone command for serial correlation testing on a specific variable, e.g.

Code:

    . xtdpdserial n, sdifference collapse

    collapsed test in seasonal differences        chi2(6)     =    60.5385
    H0: no autocorrelation of any order           Prob > chi2 =     0.0000

However, this would generally not be valid if the specified variable contains regression residuals.

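A note on standalone use with residuals: the contrast between the two routes can be sketched as follows. This is only an illustration on the abdata example; the variable name ehat is hypothetical, and predict with option e after xtreg is used here merely to produce some residual series. The postestimation call works from the stored estimation results, which is what the test needs to be computed properly, while the standalone call treats ehat like any observed variable, so its result should generally not be relied upon.

Code:

```stata
webuse abdata, clear
quietly xtreg n w k, fe vce(robust)

* Postestimation route: no variable is specified, so the test is based on
* the estimation results left behind by xtreg
xtdpdserial, sdifference collapse

* Standalone route on saved residuals: generally NOT valid, shown only as a
* contrast (ehat is a hypothetical variable name)
predict double ehat, e
xtdpdserial ehat, sdifference collapse
```
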
References:

- Arellano, M., and S. R. Bond (1991). Some tests of specification for panel data: Monte Carlo evidence and an application to employment equations. *Review of Economic Studies 58*: 277-297.
- Jochmans, K. (2020). Testing for correlation in error-component models. *Journal of Applied Econometrics 35*: 860-878.
- Kripfganz, S., M. Demetrescu, and M. Hosseinkouchack (2024). Serial correlation testing in error component models with moderately small T. *Manuscript*, University of Exeter.
- Yamagata, T. (2008). A joint serial correlation test for linear panel data models. *Journal of Econometrics 146*: 135-145.

Questions and suggestions welcome.
