  • Pooled OLS vs RE

    Hello,

    I want to estimate the panel regression

    y_it = a + b*Z_it + c*X_t + d*(Z_it*X_t) + u_it

    with either pooled OLS or the FGLS RE estimator. Here, Z is a dummy variable, X is a variable that varies only over time and not cross-sectionally, and Z*X is the interaction between Z and X. My main interest is in the coefficient estimate for b. According to the Hausman test, the RE assumption is fine, so both pooled OLS and RE should be consistent. Unfortunately, the sign of the estimate for the regression constant and, in particular, for b differs between pooled OLS and FGLS RE, while the other coefficient estimates are very similar. Which estimator should I trust more in this case: pooled OLS or the (more efficient) FGLS RE?


    . * RE estimation
    . eststo RE: qui xtreg Y Z X XZ, re cluster(id)


    . * Pooled OLS estimation
    . eststo POLS: qui reg Y Z X XZ, cluster(id)


    . * Tabulate the results
    . esttab *, mtitles r2(%7.6f) scalars(r2_w) obslast star


    --------------------------------------------
                             (1)             (2)
                              RE            POLS
    --------------------------------------------
    Z                   0.387***       -0.493***
                          (4.22)        (-16.97)

    X                   0.514***        0.516***
                         (11.19)         (11.65)

    XZ                  1.332***        1.316***
                         (24.12)         (20.45)

    _cons                -0.0776        0.176***
                         (-0.87)          (8.54)
    --------------------------------------------
    R-sq                                0.164138
    r2_w                   0.165
    N                     864483          864483
    --------------------------------------------
    t statistics in parentheses
    * p<0.05, ** p<0.01, *** p<0.001



    Thanks for sharing your insights on this.

    Kind regards,
    Daniel




  • #2
    Daniel:
    I would go -xtreg,re-.
    Just out of curiosity: how did you perform -hausman- with clustered standard errors? Thanks.
    Kind regards,
    Carlo
    (Stata 19.0)



    • #3
      Hi Carlo,

      Thanks for the reply. Your recommendation matches my gut feeling. However, the POLS results would fit existing theory better... Is there a theoretical argument for relying on the RE estimator other than that it is more efficient?

      I implemented a regression-based version of the Hausman test as discussed in Wooldridge (2010, Section 10.7.3). To do so, I proceeded as follows:

      1. I computed the panel-level time-series averages Zbar and XZbar of variables Z and XZ (since they both vary cross-sectionally and over time)
      . bysort id: egen Zbar = mean(Z)
      . bysort id: egen XZbar = mean(XZ)


      2. I then added Zbar and XZbar to the above regression model and estimated the extended regression model with -xtreg, re-
      . xtreg Y Z X XZ Zbar XZbar, re cluster(id)

      3. Finally, I used -test- to test whether the coefficient estimates for Zbar and XZbar are jointly equal to zero. The null could not be rejected. Hence, I consider the RE assumption to hold.
      . test Zbar XZbar

      Kind regards,
      Daniel



      • #4
        Daniel:
        thanks for providing further details about the -hausman- test.
        As I do not know your research field, I cannot say why POLS is more in line than RE with the existing theory.
        That said, see http://www.soderbom.net/metrix2/lec6_7.pdf, page 11, third bullet point from the bottom.
        Kind regards,
        Carlo
        (Stata 19.0)



        • #5
          Dear Daniel,

          Adding to Carlo's valuable advice, I would say that you should test the strict exogeneity assumption and decide based on that.

          Best wishes,

          Joao



          • #6
            Originally posted by Joao Santos Silva
            Dear Daniel,

            Adding to Carlo's valuable advice, I would say that you should test the strict exogeneity assumption and decide based on that.

            Best wishes,

            Joao


            Dear Joao,

            I'm currently writing my master's thesis, where the question arises: how do I test for strict exogeneity as an assumption of a fixed effects regression?

            Best regards,

            Frederic



            • #7
              Dear Ricky Virnich,

              Please check Wooldridge's textbook; he describes the test.
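
              For readers without the book at hand, here is a minimal sketch of one common version of such a test: add one-period leads of the time-varying regressors to the fixed effects regression and test whether they are jointly zero. The variable names Y, Z, X, XZ and id are taken from Daniel's posts; the time variable name "year" is a hypothetical placeholder, and the choice of which leads to include is an assumption, not a prescription.

              ```stata
              * Assumption: "year" is a hypothetical name for the panel's time variable
              xtset id year

              * FE regression augmented with one-period leads (F.) of Z and XZ
              xtreg Y Z X XZ F.Z F.XZ, fe vce(cluster id)

              * Under strict exogeneity the leads should be jointly insignificant
              test F.Z F.XZ
              ```

              A rejection would suggest that future values of the regressors are correlated with the current error, which is evidence against strict exogeneity. Note that the last period of each panel is lost when leads are used.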

              Best wishes,

              Joao



              • #8
                Daniel, what you have encountered is a very rare and very disturbing phenomenon. The OLS and RE coefficients are of similar magnitude and statistically significant, except that they have opposite signs... If I encountered something like this, I would first check my code for errors.

                I think that in this case you should trust the OLS, because, as mentioned above, OLS is valid under weak exogeneity (the regressors have to be contemporaneously uncorrelated with the error term) whereas the RE estimator is valid under strong/strict exogeneity (the regressors have to be uncorrelated with the error for all time periods.)
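
                The two conditions can be written out as follows. This is a sketch in notation that is assumed here rather than taken from the thread: x_it collects the regressors (1, Z_it, X_t, Z_it*X_t), u_it is the error term of the model in the first post, and T is the number of time periods.

                ```latex
                \begin{align*}
                \text{contemporaneous (weak) exogeneity:}\quad & E(x_{it}'\,u_{it}) = 0, \qquad t = 1,\dots,T,\\
                \text{strict exogeneity:}\quad & E(x_{is}'\,u_{it}) = 0, \qquad s,t = 1,\dots,T.
                \end{align*}
                ```

                The second condition implies the first (take s = t), so any violation of the first rules out both estimators, while a violation that involves only s ≠ t affects the RE estimator but not pooled OLS.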

                A Hausman test of strict/strong exogeneity is simply a test of whether 0.387*** == -0.493*** in your table, which obviously does not hold.

                The test of exogeneity that you have implemented is known as the Mundlak (1978) approach. (Life is tough and unfair: Mundlak (1978) and Hausman figured out how to test this at about the same time, and the Mundlak (1978) approach is much more practical, yet we call it a Hausman test when we should probably call it a Mundlak test.)

                Given the puzzlingly large difference between your RE and OLS results, you might want to run the Mundlak (1978) version of the Hausman test again, this time estimating by OLS with cluster-robust standard errors (instead of by RE, as you have done now).
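
                A minimal sketch of that suggestion, reusing the variable names from Daniel's posts (Y, Z, X, XZ, id, Zbar, XZbar). The -capture drop- lines are an added convenience, assumed here so that the block can be rerun when the panel means already exist.

                ```stata
                * Recreate the panel means of the regressors that vary within and across panels
                capture drop Zbar
                capture drop XZbar
                bysort id: egen Zbar = mean(Z)
                bysort id: egen XZbar = mean(XZ)

                * Pooled OLS on the augmented model with cluster-robust standard errors
                regress Y Z X XZ Zbar XZbar, vce(cluster id)

                * Mundlak-type test: are the panel means jointly zero?
                test Zbar XZbar
                ```

                If the panel means are jointly significant here, that would point to correlation between the regressors and the unit-level component of the error, and it would be worth comparing this outcome with the RE-based version of the test reported earlier in the thread.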
