  • Panel Data Hausman test


    Hello, I am working with panel data covering 10 years and 46 countries. All was fine until I tried the Hausman test. I realized that the random-effects model estimates all my variables, including those with collinearity issues, whereas the fixed-effects model drops 4 of my dummy variables because of collinearity. The Hausman test favors the fixed-effects model. My question is: should I run the analysis without the variables that are dropped because of collinearity in the fixed-effects model, or should I keep them in, even though they will be dropped, so that both models have the same number of independent variables?
    Thank you

  • #2
    Emmanuel:
    the -fe- estimator, as expected, wiped out time-invariant predictors:
    Code:
    . xtreg ln_wage i.race, fe
    note: 2.race omitted because of collinearity
    note: 3.race omitted because of collinearity
    
    Fixed-effects (within) regression               Number of obs     =     28,534
    Group variable: idcode                          Number of groups  =      4,711
    
    R-sq:                                           Obs per group:
         within  = 0.0000                                         min =          1
         between = 0.0050                                         avg =        6.1
         overall =      .                                         max =         15
    
                                                    F(0,23823)        =       0.00
    corr(u_i, Xb)  =      .                         Prob > F          =          .
    
    ------------------------------------------------------------------------------
         ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
            race |
          black  |          0  (omitted)
          other  |          0  (omitted)
                 |
           _cons |   1.674907   .0018961   883.35   0.000     1.671191    1.678624
    -------------+----------------------------------------------------------------
         sigma_u |  .42456905
         sigma_e |  .32028665
             rho |  .63731204   (fraction of variance due to u_i)
    ------------------------------------------------------------------------------
    F test that all u_i=0: F(4710, 23823) = 8.44                 Prob > F = 0.0000
    
    .
    It is immaterial whether you run the -fe- estimator with or without the time-invariant predictors, as they are omitted before any calculation, as you can see from the following toy example:
    Code:
    . use "http://www.stata-press.com/data/r15/nlswork.dta"
    (National Longitudinal Survey.  Young Women 14-26 years of age in 1968)
    . xtreg ln_wage tenure i.race, fe
    note: 2.race omitted because of collinearity
    note: 3.race omitted because of collinearity
    
    Fixed-effects (within) regression               Number of obs     =     28,101
    Group variable: idcode                          Number of groups  =      4,699
    
    R-sq:                                           Obs per group:
         within  = 0.0972                                         min =          1
         between = 0.1966                                         avg =        6.0
         overall = 0.1373                                         max =         15
    
                                                    F(1,23401)        =    2520.15
    corr(u_i, Xb)  = 0.1395                         Prob > F          =     0.0000
    
    ------------------------------------------------------------------------------
         ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
          tenure |   .0341807   .0006809    50.20   0.000     .0328462    .0355153
                 |
            race |
          black  |          0  (omitted)
          other  |          0  (omitted)
                 |
           _cons |   1.570329   .0027935   562.14   0.000     1.564854    1.575805
    -------------+----------------------------------------------------------------
         sigma_u |  .39172445
         sigma_e |  .30357621
             rho |  .62477177   (fraction of variance due to u_i)
    ------------------------------------------------------------------------------
    F test that all u_i=0: F(4698, 23401) = 7.80                 Prob > F = 0.0000
    
    . xtreg ln_wage tenure, fe
    
    Fixed-effects (within) regression               Number of obs     =     28,101
    Group variable: idcode                          Number of groups  =      4,699
    
    R-sq:                                           Obs per group:
         within  = 0.0972                                         min =          1
         between = 0.1966                                         avg =        6.0
         overall = 0.1373                                         max =         15
    
                                                    F(1,23401)        =    2520.15
    corr(u_i, Xb)  = 0.1395                         Prob > F          =     0.0000
    
    ------------------------------------------------------------------------------
         ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
          tenure |   .0341807   .0006809    50.20   0.000     .0328462    .0355153
           _cons |   1.570329   .0027935   562.14   0.000     1.564854    1.575805
    -------------+----------------------------------------------------------------
         sigma_u |  .39172445
         sigma_e |  .30357621
             rho |  .62477177   (fraction of variance due to u_i)
    ------------------------------------------------------------------------------
    F test that all u_i=0: F(4698, 23401) = 7.80                 Prob > F = 0.0000
    
    .
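    A minimal sketch of why this happens, assuming the same nlswork data: the -fe- estimator works on deviations from each panel's mean, and a predictor that never changes within a panel has deviations that are all zero. The variable names black, mean_black and dm_black are made up for this illustration.

    ```stata
    webuse nlswork, clear
    * indicator for race == 2 (black); constant within each idcode
    generate byte black = (race == 2)
    * panel-specific mean of the indicator
    bysort idcode: egen mean_black = mean(black)
    * within-transformed indicator: value minus its panel mean
    generate dm_black = black - mean_black
    * if race is time-invariant, min and max should both be 0
    summarize dm_black
    ```

    A column of zeros carries no information, which is why the coefficient cannot be estimated and is reported as omitted.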
    Kind regards,
    Carlo
    (Stata 18.0 SE)



    • #3
      Thanks, Carlo. I think I am being biased here. Even though the Hausman test favors the fixed-effects model, in some runs the Hausman test fails, and I am worried that I will have inconsistencies in my reporting.



      chi2(4) = (b-B)'[(V_b-V_B)^(-1)](b-B)
      = -3.48 chi2<0 ==> model fitted on these
      data fails to meet the asymptotic
      assumptions of the Hausman test;
      see suest for a generalized test


      Also, the fixed-effects model gives me a p-value suggesting that the model is not a good fit, whereas the random-effects model gives me a good p-value.



      • #4
        Emmanuel:
        1) -hausman- can (pretty) easily let you down.
        As a workaround, you can test the -re- assumption via the community-contributed command -xtoverid- (which you can find and install by typing -search xtoverid-):
        Code:
        .
        . use "http://www.stata-press.com/data/r15/nlswork.dta"
        (National Longitudinal Survey.  Young Women 14-26 years of age in 1968)
        . xi: xtreg ln_wage tenure i.race, re
        i.race            _Irace_1-3          (naturally coded; _Irace_1 omitted)
        
        Random-effects GLS regression                   Number of obs     =     28,101
        Group variable: idcode                          Number of groups  =      4,699
        
        R-sq:                                           Obs per group:
             within  = 0.0972                                         min =          1
             between = 0.2079                                         avg =        6.0
             overall = 0.1569                                         max =         15
        
                                                        Wald chi2(3)      =    3532.05
        corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000
        
        ------------------------------------------------------------------------------
             ln_wage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
              tenure |   .0376405   .0006448    58.37   0.000     .0363767    .0389043
            _Irace_2 |  -.1345322   .0120866   -11.13   0.000    -.1582215   -.1108429
            _Irace_3 |   .1039944   .0504227     2.06   0.039     .0051677    .2028211
               _cons |    1.59266   .0066729   238.68   0.000     1.579581    1.605738
        -------------+----------------------------------------------------------------
             sigma_u |  .33623102
             sigma_e |  .30357621
                 rho |  .55090591   (fraction of variance due to u_i)
        ------------------------------------------------------------------------------
        
        . xtoverid
        
        Test of overidentifying restrictions: fixed vs random effects
        Cross-section time-series model: xtreg re  
        Sargan-Hansen statistic 241.991  Chi-sq(1)    P-value = 0.0000
        
        .
        The -xtoverid- outcome points towards -fe- specification.
        Please note that, being glorious but a bit old-fashioned, the community-contributed programme -xtoverid- does not support the -fvvarlist- notation. The usual fix is to put the -xi:- prefix before the -xtreg- command.
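        For readers who still want a classical Hausman test, one option documented for -hausman- is -sigmamore-, which bases both covariance matrices on the disturbance variance estimated by the efficient estimator (here -re-). This may avoid a negative chi2, although that is not guaranteed for every dataset. The sketch below is an illustration on the nlswork data, not something suggested earlier in this thread; the stored-estimate names fe and re are arbitrary.

        ```stata
        webuse nlswork, clear
        * race is omitted by -fe- anyway, so it is left out here
        xtreg ln_wage tenure, fe
        estimates store fe
        xtreg ln_wage tenure i.race, re
        estimates store re
        * compare the coefficient the two models share (tenure)
        hausman fe re, sigmamore
        ```

        The consistent estimator (-fe-) is listed first and the efficient one (-re-) second, which is the order -hausman- expects.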

        2) Judging a regression model by its p-values is hardly scientific. Please note that non-significant coefficients are as informative as significant ones.
        That said, I would be much more worried about the risk of model misspecification, which you can easily test via the following approach:
        Code:
        .
        . use "http://www.stata-press.com/data/r15/nlswork.dta"
        (National Longitudinal Survey.  Young Women 14-26 years of age in 1968)
        
        . xtreg ln_wage tenure i.race, fe
        note: 2.race omitted because of collinearity
        note: 3.race omitted because of collinearity
        
        Fixed-effects (within) regression               Number of obs     =     28,101
        Group variable: idcode                          Number of groups  =      4,699
        
        R-sq:                                           Obs per group:
             within  = 0.0972                                         min =          1
             between = 0.1966                                         avg =        6.0
             overall = 0.1373                                         max =         15
        
                                                        F(1,23401)        =    2520.15
        corr(u_i, Xb)  = 0.1395                         Prob > F          =     0.0000
        
        ------------------------------------------------------------------------------
             ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
              tenure |   .0341807   .0006809    50.20   0.000     .0328462    .0355153
                     |
                race |
              black  |          0  (omitted)
              other  |          0  (omitted)
                     |
               _cons |   1.570329   .0027935   562.14   0.000     1.564854    1.575805
        -------------+----------------------------------------------------------------
             sigma_u |  .39172445
             sigma_e |  .30357621
                 rho |  .62477177   (fraction of variance due to u_i)
        ------------------------------------------------------------------------------
        F test that all u_i=0: F(4698, 23401) = 7.80                 Prob > F = 0.0000
        
        . predict fitted, xb
        (433 missing values generated)
        
        . g sq_fitted=fitted^2
        (433 missing values generated)
        
        . xtreg ln_wage fitted sq_fitted , fe
        
        Fixed-effects (within) regression               Number of obs     =     28,101
        Group variable: idcode                          Number of groups  =      4,699
        
        R-sq:                                           Obs per group:
             within  = 0.1093                                         min =          1
             between = 0.2233                                         avg =        6.0
             overall = 0.1513                                         max =         15
        
                                                        F(2,23400)        =    1435.15
        corr(u_i, Xb)  = 0.1528                         Prob > F          =     0.0000
        
        ------------------------------------------------------------------------------
             ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
              fitted |   6.735689   .3231599    20.84   0.000     6.102275    7.369104
           sq_fitted |  -1.594469   .0896669   -17.78   0.000    -1.770222   -1.418716
               _cons |  -5.108404   .2891933   -17.66   0.000    -5.675242   -4.541566
        -------------+----------------------------------------------------------------
             sigma_u |    .387985
             sigma_e |  .30155209
                 rho |  .62341011   (fraction of variance due to u_i)
        ------------------------------------------------------------------------------
        F test that all u_i=0: F(4698, 23400) = 7.76                 Prob > F = 0.0000
        
        . test sq_fitted
        
         ( 1)  sq_fitted = 0
        
               F(  1, 23400) =  316.20
                    Prob > F =    0.0000
        
        .
        Because the squared fitted values reach statistical significance, the -test- outcome warns of regression model misspecification.
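        The logic of the check is that, if the functional form is adequate, the squared fitted values should add no explanatory power once the fitted values themselves are included. A possible next step, sketched here under the assumption that a quadratic in tenure is a sensible candidate, is to enrich the specification and repeat the same check. The names fitted2 and sq_fitted2 are hypothetical, and no particular result is implied.

        ```stata
        webuse nlswork, clear
        * candidate specification: tenure plus its square
        xtreg ln_wage c.tenure##c.tenure, fe
        * linear prediction from the candidate model
        predict fitted2, xb
        generate sq_fitted2 = fitted2^2
        * regress the outcome on the fitted values and their square
        xtreg ln_wage fitted2 sq_fitted2, fe
        * a non-significant squared term would be reassuring
        test sq_fitted2
        ```

        If the squared term remains significant, other omitted predictors or interactions are worth considering.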
        Kind regards,
        Carlo
        (Stata 18.0 SE)



        • #5
          Hello Carlo, thanks for the detailed help.
          I installed -xtoverid-, but when I run the xtoverid command after the regression, I get this error:
          . xtoverid
          Error - must have ivreg2/ivreg29/ivreg28 version 2.1.15 or greater installed

          For the model specification, I will check it out by reading on it as I do not understand the fitted and sq_fitted idea.

          2. I would be glad if you could show me how you export the results so that they appear the way they do here. I copy my results as a picture, which looks horrible. I would also appreciate it if you could direct me to a document on both exporting results as in your post and the fitted-values analysis. Thanks for your time and help.



          • #6
            Emmanuel:
            1) simply update the user-written commands mentioned in the -xtoverid- error message;
            2) you can get further details about the way misspecification can be investigated taking a look at -linktest- command entry in Stata .pdf manual;
            3) it's easier to do than to explain: just click on the # toggle and paste within the appearing CODE delimiters what you copied from a .do file or from your Stata session.
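            Regarding point 1), a minimal sketch of the update, assuming the packages are taken from SSC and that an internet connection is available. Including -ranktest- is an assumption on my part: -ivreg2- relies on it, so it is refreshed here as well.

            ```stata
            * install or refresh the packages from SSC
            ssc install ivreg2, replace
            ssc install ranktest, replace
            ssc install xtoverid, replace
            * confirm which version of ivreg2 Stata now finds
            which ivreg2
            ```

            After this, rerunning -xtoverid- right after the -xi: xtreg ..., re- command should no longer stop at the version check.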
            Kind regards,
            Carlo
            (Stata 18.0 SE)



            • #7
              Thanks so much, Carlo.
