Hausman test with omitted variables - still valid?

Junran Cao

Join Date: May 2019
Posts: 75

14 Jul 2019, 20:40

Hi Statalist

In the help files for -hausman- we see an example (eg1 on p.894, Stata 16) in which all the regressors vary with time.

My question is whether the Hausman test is still valid when there are time-invariant regressors that the FE model subsequently removes.

In the simulated data below, the unobserved time-invariant component is correlated with regressor x1. As such, it would suggest that we should apply an FE model instead of an RE model.

However, x1 is time-invariant, so the FE model drops it, and the resulting Hausman test supports the use of RE. Does this mean that for the Hausman test to be valid (ignoring homoskedasticity for the time being), the RE and FE equations must contain the same set of regressors?

Thanks.

Code:

*****************
clear
set seed 111
set obs 1000
*****************
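generate id     = _n                  // one row per panel unit
generate year   = 2000                // base year; incremented after expand
generate x1     = runiform()> .5      // time-invariant binary regressor

generate nu     = rnormal()
generate alpha = x1 + nu              // unobserved effect, correlated with x1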
*****************
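expand 5                              // five observations per id
bysort id:  replace year = year + _n  // years 2001-2005 within each id
*****************
generate x2    = rbeta(2,3)           // time-varying, independent of x1 and alpha
generate u     = rnormal()            // idiosyncratic error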
*****************
generate y  = (3) + (1) * x1 + (1) * x2 + alpha + u 
// the unobserved time-invariant component alpha is correlated with regressor x1,
// so FE would be more appropriate

xtset id year

xtsum // to check that x1 is time-invariant whereas x2 varies with time

quietly xtreg y x1 x2, re
estimates store RE

quietly xtreg y x1 x2, fe
estimates store FE

hausman FE RE, sigmamore

                 ---- Coefficients ----
             |      (b)          (B)            (b-B)     sqrt(diag(V_b-V_B))
             |       FE           RE         Difference          S.E.
-------------+----------------------------------------------------------------
          x2 |    .9144598     .9098057        .0046541         .016269
------------------------------------------------------------------------------
                           b = consistent under Ho and Ha; obtained from xtreg
            B = inconsistent under Ha, efficient under Ho; obtained from xtreg

    Test:  Ho:  difference in coefficients not systematic

                  chi2(1) = (b-B)'[(V_b-V_B)^(-1)](b-B)
                          =        0.08
                Prob>chi2 =      0.7748


Jeff Wooldridge

Join Date: Apr 2014

Posts: 2170
#2

15 Jul 2019, 02:07

Not only is it valid, it’s almost always desirable to include time constant variables in the RE estimation. By putting in time-constant controls there’s a better chance that the time-varying covariates are exogenous. If you omit x1 and it’s correlated with x2 then RE is inconsistent. The Hausman test is intended to only compare coefficients that have some variation across i and t.

Incidentally, if you’re going to apply this test you should look into a robust version of it.
Junran Cao

Join Date: May 2019

Posts: 75
#3

15 Jul 2019, 05:34

Thanks very much Jeff for the explanation.

If I may inquire a bit further: the data in my example were set up so that FE would be the more appropriate choice. However, the results of the Hausman test (if I interpreted Prob>chi2 = 0.7748 correctly) suggest I should use an RE model instead. I am struggling to reconcile this apparent disparity. In your response "The Hausman test is intended to only compare coefficients that have some variation across i and t.", does this mean the Hausman test suggests I use RE if x2 is the only regressor, but cannot say anything if both x1 and x2 are included?

Regarding a robust version of the Hausman test, might that be the Mundlak test?

Thanks.
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17712
#4

16 Jul 2019, 01:10

Junran:
you may want to take a look at -search RHAUSMAN-.

Kind regards,
Carlo
(Stata 19.0)
Junran Cao

Join Date: May 2019

Posts: 75
#5

16 Jul 2019, 05:06

Thank you very much Carlo for alerting me to the -rhausman- package.
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2170
#6

16 Jul 2019, 05:56

Junran: You generated x2 so that it is independent of x1 and alpha. Therefore, leaving alpha in the error term causes no systematic bias in the coefficient on x2: RE is consistent in this case. You should generate alpha to be correlated with a time constant component of x2. I would generate x2 to depend directly on alpha.

Oh, and yes you can use the Mundlak version of the FE estimator to obtain a robust test. Or rhausman, although I need to look at it.

JW
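
For readers who want to try the Mundlak route mentioned above, here is a minimal sketch. It assumes data shaped like the simulation in the original post (same variable names y, x1, x2, id, year); the name x2bar is made up for illustration. The idea is to add the within-unit average of each time-varying regressor to the RE equation, use cluster-robust standard errors, and test whether the coefficients on the averages are zero.

```stata
* Illustrative data, mirroring the simulation in the original post
clear
set seed 111
set obs 1000
generate id    = _n
generate x1    = runiform() > .5
generate alpha = x1 + rnormal()
expand 5
bysort id: generate year = 2000 + _n
generate x2 = rbeta(2,3)
generate y  = 3 + x1 + x2 + alpha + rnormal()
xtset id year

* Mundlak device: within-unit time average of the time-varying regressor
* (x2bar is a hypothetical variable name)
bysort id: egen x2bar = mean(x2)

* RE with the average added, cluster-robust standard errors
xtreg y x1 x2 x2bar, re vce(cluster id)

* Robust Hausman-type test: H0 is that the coefficient on x2bar is zero
test x2bar
```

With several time-varying regressors, one would add an average for each and test them jointly. A small p-value from -test- is evidence against the RE assumption.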
Junran Cao

Join Date: May 2019

Posts: 75
#7

16 Jul 2019, 20:30

Jeff:
Thanks for your help. I re-did the simulation such that x2 depends directly on alpha as per your comment and the result now came back as intended. This has helped to clarify my understanding greatly. Much appreciated.
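
A revised simulation along these lines might look like the sketch below. It is not necessarily the exact code that was run: the loading of 0.5 on alpha and the normal noise in x2 are arbitrary assumptions chosen only so that x2 depends directly on alpha.

```stata
clear
set seed 111
set obs 1000
generate id    = _n
generate x1    = runiform() > .5
generate alpha = x1 + rnormal()      // time-invariant unobserved effect

expand 5
bysort id: generate year = 2000 + _n

* Assumption: x2 loads on alpha with coefficient 0.5 (arbitrary choice)
generate x2 = 0.5 * alpha + rnormal()
generate y  = 3 + x1 + x2 + alpha + rnormal()

xtset id year

quietly xtreg y x1 x2, fe
estimates store FE
quietly xtreg y x1 x2, re
estimates store RE

* x2 is now correlated with the part of alpha left in the RE error term,
* so the FE and RE coefficients on x2 should differ systematically
hausman FE RE, sigmamore
```

Because x2 is correlated with alpha even after controlling for x1, the RE coefficient on x2 is inconsistent here, and the test should reject in favour of FE.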