  • Hausman test with omitted variables - still valid?

    Hi Statalist

    In the help files for -hausman- we see an example (eg1 on p.894, Stata 16) in which all the regressors vary with time.

    My question is whether the Hausman test is still valid when the model includes time-invariant regressors, which the FE estimator subsequently drops.

    In the simulated data below, the unobserved time-invariant component is correlated with regressor x1. As such, it would suggest that we should apply an FE model instead of an RE model.

    However, x1 is time-invariant and the resultant Hausman test supports the use of RE. Does this mean that in order for the Hausman test to be valid (ignoring homoskedasticity for the time being), both sets of RE and FE equations must contain the same set of regressors?

    Thanks.

    Code:
    *****************
    clear
    set seed 111
    set obs 1000
    *****************
    generate id     = _n
    generate year   = 2000
    generate x1     = runiform()> .5
    
    generate nu     = rnormal()
    generate alpha = x1 + nu
    *****************
    expand 5
    bysort id:  replace year = year + _n
    *****************
    generate x2    = rbeta(2,3)
    generate u     = rnormal()
    *****************
    generate y  = (3) + (1) * x1 + (1) * x2 + alpha + u 
    // the unobserved time-invariant component alpha is correlated with regressor x1,
    // so FE would be more appropriate
    
    xtset id year
    
    xtsum // to check that x1 is time-invariant whereas x2 varies with time
    
    quietly xtreg y x1 x2, re
    estimates store RE
    
    quietly xtreg y x1 x2, fe
    estimates store FE
    
    hausman FE RE, sigmamore
    
                     ---- Coefficients ----
                 |      (b)          (B)            (b-B)     sqrt(diag(V_b-V_B))
                 |       FE           RE         Difference          S.E.
    -------------+----------------------------------------------------------------
              x2 |    .9144598     .9098057        .0046541         .016269
    ------------------------------------------------------------------------------
                               b = consistent under Ho and Ha; obtained from xtreg
                B = inconsistent under Ha, efficient under Ho; obtained from xtreg
    
        Test:  Ho:  difference in coefficients not systematic
    
                      chi2(1) = (b-B)'[(V_b-V_B)^(-1)](b-B)
                              =        0.08
                    Prob>chi2 =      0.7748

  • #2
    Not only is it valid, it’s almost always desirable to include time-constant variables in the RE estimation. By putting in time-constant controls there’s a better chance that the time-varying covariates are exogenous. If you omit x1 and it’s correlated with x2 then RE is inconsistent. The Hausman test is intended to only compare coefficients that have some variation across i and t.

    Incidentally, if you’re going to apply this test you should look into a robust version of it.

    • #3
      Thanks very much Jeff for the explanation.

      If I may inquire a bit further: the data in my example were set up so that FE is the more appropriate choice. However, the result of the Hausman test (if I interpreted Prob>chi2 = 0.7748 correctly) suggests I should use an RE model instead. I am struggling to reconcile this apparent disparity. In your response "The Hausman test is intended to only compare coefficients that have some variation across i and t.", does this mean the Hausman test suggests I use RE if x2 is the only regressor, but cannot say anything if both x1 and x2 are included?

      Regarding a robust version of the Hausman test, might that be the Mundlak test?

      Thanks.

      • #4
        Junran:
        you may want to take a look at -search RHAUSMAN-.
        Kind regards,
        Carlo
        (Stata 18.0 SE)

        • #5
          Thank you very much Carlo for alerting me to the -rhausman- package.

          • #6
            Junran: You generated x2 so that it is independent of x1 and alpha. Therefore, leaving alpha in the error term causes no systematic bias in the coefficient on x2: RE is consistent in this case. You should generate alpha to be correlated with a time-constant component of x2. I would generate x2 to depend directly on alpha.

            Oh, and yes you can use the Mundlak version of the FE estimator to obtain a robust test. Or rhausman, although I need to look at it.

            JW
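
            For readers following along, here is a minimal sketch of the suggestion in #6. It reuses the data-generating process from #1 but lets x2 depend directly on alpha, then runs the classical Hausman test and a Mundlak-style robust alternative. The 0.5 loading on alpha, the normal noise in x2, and the variable name x2bar are illustrative assumptions, not taken from the thread. If RE is inconsistent for x2, both tests should tend to reject.

            Code:
            clear
            set seed 111
            set obs 1000
            generate id    = _n
            generate year  = 2000
            generate x1    = runiform() > .5
            generate nu    = rnormal()
            generate alpha = x1 + nu
            expand 5
            bysort id: replace year = year + _n

            // assumption: x2 now depends directly on alpha (0.5 is an arbitrary loading)
            generate x2 = 0.5*alpha + rnormal()
            generate u  = rnormal()
            generate y  = (3) + (1) * x1 + (1) * x2 + alpha + u

            xtset id year

            // classical Hausman test; it compares only the coefficient on x2,
            // because FE drops the time-invariant x1
            quietly xtreg y x1 x2, fe
            estimates store FE
            quietly xtreg y x1 x2, re
            estimates store RE
            hausman FE RE, sigmamore

            // Mundlak-style robust test: add the panel mean of each time-varying
            // regressor to the RE model and test it with cluster-robust errors
            bysort id: egen x2bar = mean(x2)
            xtreg y x1 x2 x2bar, re vce(cluster id)
            test x2bar

            In the Mundlak regression, the null that the coefficient on x2bar is zero corresponds to RE being consistent for x2, and clustering on id makes the test robust to heteroskedasticity and serial correlation within panels.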

            • #7
              Jeff:
              Thanks for your help. I re-did the simulation such that x2 depends directly on alpha as per your comment and the result now came back as intended. This has helped to clarify my understanding greatly. Much appreciated.
