  • Handling Autocorrelation in Pooled OLS Panel Data Regression

    Hello, I would like to ask a question. I am working with panel data where the Chow test, Hausman test, and Lagrange Multiplier (LM) test all indicate that the pooled OLS (common effect) model is the most appropriate specification. The White test shows no evidence of heteroskedasticity, but the Wooldridge test detects autocorrelation in the residuals. What would be the most appropriate way to address the autocorrelation while keeping the pooled model framework?

    Thank you

    Best regards,
    Wulan

  • #2
    Wulan:
    Just add -robust- or -vce(cluster panelid)- standard errors, provided you have at least 30 panels.
    Kind regards,
    Carlo
    (Stata 19.0)

    • #3
      Options include clustered standard errors, xtscc, xtgls with corr(ar1), prais, and newey.

      • #4
        Carlo Lazzaro, George Ford: thank you.
        I understand that robust or cluster standard errors are recommended when there are at least 30 panels. In my case, I have only 9 panels (cross-sectional units) and 15 time periods. Would it still be appropriate to use vce(cluster panelid) in this situation?
        Last edited by Wulan Lika; 01 May 2026, 04:03.

        • #5
          Wulan:
          No, it is not.
          As George surmised, you should consider a model suited to T>N panel data regression (say, -xtregar-).
          Kind regards,
          Carlo
          (Stata 19.0)

          • #6
            Thank you, Carlo Lazzaro.
            Since xtregar is for FE/RE models, what is the appropriate method for handling serial correlation in pooled OLS?

            • #7
              Wulan:
              There is no way to apply cluster standard errors with such a limited number of panels.
              As George wisely suggested, go with -newey-.
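              A minimal sketch of what that could look like, using Stata's invest2 example dataset as a stand-in (an assumption for illustration, not the data discussed in this thread). -newey- requires the lag() option and data declared with -tsset- or -xtset-; lag(1) is an illustrative choice, and the -force- option may be needed if the time variable is not equally spaced.

              ```stata
              * Stand-in data: Stata's invest2 example panel, not the poster's data
              webuse invest2, clear
              xtset company time

              * Pooled OLS point estimates with conventional standard errors, for comparison
              regress invest market stock

              * Same coefficients, with Newey-West standard errors allowing
              * autocorrelation up to lag 1 (lag choice is an assumption)
              newey invest market stock, lag(1)
              ```

              The coefficients should match those from -regress-; only the standard errors change.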
              Kind regards,
              Carlo
              (Stata 19.0)

              • #8
                With small N and large T balanced panels, xtgls and xtpcse will work. In your case (only autocorrelation and T not very large) I would use:
                xtpcse y x, i c(ar1)
                xtgls y x, p(i) c(ar1)
                Manh Hoang-Ba,

                • #9
                  Thank you, Manh Hoang Ba.
                  If I use the xtgls command, how can I obtain the R-squared for the estimation? Thank you.

                  • #10
                    Hi Wulan Lika

                    Stata states that R-squared is not reported after xtgls to avoid misunderstandings; see the Stata FAQ "R-squared after xtgls".

                    If you want to calculate a measure of the predictive power of results from xtgls, you can manually calculate the three R-squared values: within, between, and overall.

                    Code:
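                    webuse invest2, clear
                    xtset company time
                    xtgls invest market stock, panels(hetero) corr(ar1)
                    * fitted values (linear prediction) for the estimation sample
                    qui predict double invest_hat if e(sample)

                    * panel means (_i) and deviations from panel means (_w)
                    * of the actual and fitted values
                    qui foreach var of varlist invest invest_hat {
                        egen double `var'_i = mean(`var') if e(sample) , by(company)
                        gen double `var'_w = `var' - `var'_i
                    }

                    * within: squared correlation of demeaned actual and fitted values
                    qui corr invest_w invest_hat_w if e(sample)
                    scalar r2_w = r(rho)^2
                    di r2_w

                    * between: squared correlation of panel means of actual and fitted values
                    qui corr invest_i invest_hat_i if e(sample)
                    scalar r2_b = r(rho)^2
                    di r2_b

                    * overall: squared correlation of actual and fitted values
                    qui corr invest invest_hat if e(sample)
                    scalar r2_o = r(rho)^2
                    di r2_o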
                    Manh Hoang-Ba,

                    • #11
                      There's very little you can do that's convincing from a statistical inference perspective with N = 9 and T = 15. You shouldn't use generalized least squares, because its desirable properties hold only asymptotically. Use pooled OLS and report Driscoll-Kraay standard errors (xtscc) with a lag of one. Even this is pushing it, but it does account for some serial correlation when computing the standard errors.
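
                      A minimal sketch of that suggestion, assuming the community-contributed xtscc command is installed from SSC. The invest2 example dataset and its variables are stand-ins for illustration only, not the data discussed in this thread.

                      ```stata
                      * Install the community-contributed command (assumes internet access)
                      ssc install xtscc, replace

                      * Stand-in data: Stata's invest2 example panel, not the poster's data
                      webuse invest2, clear
                      xtset company time

                      * Pooled OLS coefficients with Driscoll-Kraay standard errors,
                      * maximum lag of 1 as suggested above
                      xtscc invest market stock, lag(1)
                      ```

                      Without the fe or re option, xtscc estimates by pooled OLS, so only the standard errors differ from a plain -regress-.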

                      If this is a kind of difference-in-differences analysis I have other suggestions.

                      BTW, all of the diagnostics you computed rely on large N asymptotics, and so I wouldn't take too much from them. But the finding of serial correlation is not surprising.
