Handling Autocorrelation in Pooled OLS Panel Data Regression

Wulan Lika

Join Date: Mar 2026

Posts: 13
#1

29 Apr 2026, 01:21

Hello, I would like to ask a question. I am working with panel data, and the Chow test, Hausman test, and Lagrange Multiplier (LM) test all indicate that the pooled OLS (common effect) model is the most appropriate specification. The White test shows no evidence of heteroskedasticity, but the Wooldridge test detects autocorrelation in the residuals. In this situation, what would be the most appropriate way to address the autocorrelation while keeping the pooled model framework?

Thank you

Best regards,
Wulan
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17854
#2

30 Apr 2026, 08:30

Wulan:
just add -robust- or -vce(cluster panelid)- standard errors, provided you have at least 30 panels.

Kind regards,
Carlo
(Stata 19.0)
George Ford

Join Date: Aug 2014

Posts: 3337
#3

30 Apr 2026, 10:52

clustered errors, xtscc, xtgls corr(ar1), prais, newey
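
For reference, a rough sketch of how the built-in commands in this list might be typed. It uses Stata's invest2 example dataset as a stand-in, so the variable names (invest, market, stock, company, time) are assumptions for illustration and not the original poster's data. prais is left out because its handling of panel data should be checked in help prais first, and xtscc is community-contributed and needs a separate install.

```stata
* Illustration only: invest2 stands in for the actual panel
webuse invest2, clear
xtset company time

* Pooled OLS with standard errors clustered on the panel identifier
regress invest market stock, vce(cluster company)

* FGLS with a common AR(1) coefficient across panels
xtgls invest market stock, corr(ar1)

* Pooled OLS with Newey-West standard errors, one lag
* (the force option may be needed if the time variable has gaps)
newey invest market stock, lag(1)
```

The point estimates from regress and newey are the same pooled OLS coefficients; only the standard errors differ. xtgls re-estimates the coefficients by FGLS, so its point estimates will generally differ as well.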
Wulan Lika

Join Date: Mar 2026

Posts: 13
#4

01 May 2026, 03:58

Carlo Lazzaro, George Ford: thank you.
I understand that using robust or cluster standard errors is recommended when there are at least 30 panels. In my case, I only have 9 panels (cross-sectional units) and 15 time periods. Would it still be appropriate to use vce(cluster panelid) in this situation?

Last edited by Wulan Lika; 01 May 2026, 04:03.
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17854
#5

01 May 2026, 04:22

Wulan:
no, it is not.
As George surmised, you should consider a model for T>N panel data regression (say, -xtregar-).

Kind regards,
Carlo
(Stata 19.0)
Wulan Lika

Join Date: Mar 2026

Posts: 13
#6

01 May 2026, 05:48

Thank you Carlo Lazzaro
Since xtregar is for FE/RE models, what is the appropriate method to handle serial correlation in pooled OLS?
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17854
#7

01 May 2026, 07:46

Wulan:
no way to apply cluster standard errors with such a limited number of panels.
As George wisely suggested, go -newey-.

Kind regards,
Carlo
(Stata 19.0)
Manh Hoang Ba

Join Date: Aug 2023

Posts: 87
#8

01 May 2026, 09:50

With small N and large T balanced panels, xtgls and xtpcse will work. In your case (only autocorrelation and T not very large) I would use:
xtpcse y x, i c(ar1)
xtgls y x, p(i) c(ar1)

Manh Hoang-Ba,
Facebook,
Eureka! Uni - YouTube,
ManhHB94 (Manh Hoang Ba),
Hoàng Bá Mạnh – Kinh tế lượng: Lý thuyết và ứng dụng
Wulan Lika

Join Date: Mar 2026

Posts: 13
#9

05 May 2026, 02:15

Thank you Manh Hoang Ba
If I use the xtgls command, how can I obtain the R-squared value for the estimation? Thank you

Manh Hoang Ba

Join Date: Aug 2023
Posts: 87

#10

05 May 2026, 02:33

Hi Wulan Lika

Stata does not report R-squared after xtgls, to avoid misinterpretation; see the Stata FAQ "R-squared after xtgls".

If you want to calculate a measure of the predictive power of results from xtgls, you can manually calculate the three R-squared values: within, between, and overall.

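Code:

webuse invest2, clear
xtset company time
xtgls invest market stock, panels(hetero) corr(ar1)
* fitted values (linear prediction) for the estimation sample
qui predict double invest_hat if e(sample)

* panel means (_i) and deviations from panel means (_w) of actual and fitted values
qui foreach var of varlist invest invest_hat {
    egen double `var'_i = mean(`var') if e(sample) , by(company)
    gen double `var'_w = `var' - `var'_i
}

*    within: squared correlation of the demeaned series
qui corr invest_w invest_hat_w if e(sample)
scalar r2_w = r(rho)^2
di r2_w

*    between: squared correlation of the panel means
qui corr invest_i invest_hat_i if e(sample)
scalar r2_b = r(rho)^2
di r2_b

*    overall: squared correlation of actual and fitted values
qui corr invest invest_hat if e(sample)
scalar r2_o = r(rho)^2
di r2_o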


Jeff Wooldridge

Join Date: Apr 2014

Posts: 2291
#11

07 May 2026, 07:49

There's very little you can do that's convincing from a statistical inference perspective with N = 9, T = 15. You shouldn't use generalized least squares, as it only has desirable asymptotic properties. Use pooled OLS and report the Driscoll-Kraay standard errors (xtscc) with a lag of one. Even this is pushing it, but it does account for some serial correlation when computing the standard errors.
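
A minimal sketch of that suggestion, reusing the invest2 example dataset from earlier in the thread as a stand-in for the actual N = 9, T = 15 panel (an assumption for illustration only). xtscc is a community-contributed command, so it has to be installed from SSC before use.

```stata
* Illustration only: invest2 is not the original poster's data
webuse invest2, clear
xtset company time

* xtscc is community-contributed; install (or update) it from SSC
ssc install xtscc, replace

* Pooled OLS with conventional standard errors, for comparison
regress invest market stock

* Same pooled OLS coefficients, Driscoll-Kraay standard errors with one lag
xtscc invest market stock, lag(1)
```

Comparing the two outputs shows how much the standard errors move once serial correlation is allowed for; the coefficients themselves are unchanged.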

If this is a kind of difference-in-differences analysis I have other suggestions.

BTW, all of the diagnostics you computed rely on large N asymptotics, and so I wouldn't take too much from them. But the finding of serial correlation is not surprising.
