  • hausman test

    Hi all,

    I am running a panel model on the effect of promotion on work performance.
    I tried to run a Hausman test to decide whether I should use fe or re:
    Code:
     xtreg (variables), fe
    estimates store fe
    xtreg (variables), re
    estimates store re
    hausman fe re
    delivers the following output:
    Code:
       Test:  Ho:  difference in coefficients not systematic
    
                     chi2(23) = (b-B)'[(V_b-V_B)^(-1)](b-B)
                              =    -4.65    chi2<0 ==> model fitted on these
                                            data fails to meet the asymptotic
                                            assumptions of the Hausman test;
                                            see suest for a generalized test
    Does anyone have experience with this problem? I fail to understand why the test will not work.

    Best, Alexander-Florian

  • #2
    Alex:
    this is a really frequent nuisance with the -hausman- test: in a limited sample, you cannot take for granted that the difference between the two variance-covariance matrices (V_b-V_B) is positive definite.
    One fix is to add the -sigmamore- option available with -hausman-.
    If that does not do the trick, you can consider the user-written program -xtoverid-.
    Kind regards,
    Carlo
    (Stata 19.0)
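
    A minimal sketch of the -sigmamore- fix, assuming the public nlswork toy data (ln_wage and age stand in for Alex's undisclosed variables):
    Code:
    webuse nlswork, clear
    xtset idcode year
    * fit both models with default standard errors and store them
    xtreg ln_wage age, fe
    estimates store fe
    xtreg ln_wage age, re
    estimates store re
    * sigmamore bases both covariance matrices on the disturbance
    * variance estimated from the efficient (re) model
    hausman fe re, sigmamore
    Using a common variance estimate tends to avoid a difference V_b-V_B that is not positive definite, which is what produces a negative chi2.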



    • #3
      Carlo,

      hello again and welcome to my second post

      Thanks for your hints,

      the sigmamore option delivers
      Code:
       Test:  Ho:  difference in coefficients not systematic
      
                       chi2(23) = (b-B)'[(V_b-V_B)^(-1)](b-B)
                                =        1.35
                      Prob>chi2 =      1.0000
      The p-value of 1.0000 makes me a little suspicious (it seems "too high").
      Does this simply mean the test strongly favors the random-effects model? Or does the 1.0000 indicate a mistake somewhere?

      Cheers, Alex



      • #4
        Alex:
        try -xtoverid- and see if it also recommends -re- specification
        Kind regards,
        Carlo
        (Stata 19.0)



        • #5
          Carlo,

          xtoverid doesn't work, as it delivers:
          Code:
           cAge#c:  operator invalid
          I searched earlier posts and saw that others facing the same problem concluded that xtoverid can't handle factor variables. Is this still the case? And how would I deal with an assumed quadratic relationship in Age (c.Age##c.Age) and with i.year?

          Cheers, Alex



          • #6
            Update:
            Carlo, in the meantime I found your solution.
            My code now looks like this:
            Code:
             xi: xtreg performance promoted_in_observation_period Age Age_squared (...) i.year, re robust
            delivering:
            Code:
            Test of overidentifying restrictions: fixed vs random effects
            Cross-section time-series model: xtreg re  robust cluster(ID)
            Sargan-Hansen statistic  42.372  Chi-sq(23)   P-value = 0.0082
            So am I fine with the FE model then? If I don't use the robust option in the first code block, the p-value is > 0.05. In that case RE would be preferable.

            Cheers, Alex



            • #7
              Alex:
              - glad you found the -xi:- prefix workaround yourself;
              - robust/cluster standard errors can obviously affect the -xtoverid- outcome: if you detect evidence of heteroskedasticity and/or autocorrelation in your dataset, invoking these options sounds wise;
              - you are correct in interpreting the results of -xtoverid- for both robust/cluster and default standard errors;
              - by the way, -hausman- is not feasible with non-default standard errors, nor is it correct to switch to cluster/robust standard errors after running the -hausman- test.
              Kind regards,
              Carlo
              (Stata 19.0)
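
              A minimal sketch of the -xi:- workaround, assuming the nlswork toy data and a hypothetical variable name age_sq; -xtoverid- is user-written (ssc install xtoverid) and may need further SSC packages such as -ivreg2- and -ranktest-:
              Code:
              webuse nlswork, clear
              xtset idcode year
              * build the squared term by hand instead of c.age##c.age
              generate age_sq = age^2
              * xi: expands i.year into plain dummy variables
              xi: xtreg ln_wage age age_sq i.year, re vce(cluster idcode)
              * must be run directly after the random-effects fit
              xtoverid
              A rejection by -xtoverid- points towards the -fe- specification, as in the output discussed in #6.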



              • #8
                Carlo, many thanks for your helpful comments!

                To be honest, I don't fully understand your last side note:

                Originally posted by Carlo Lazzaro:
                - by the way, -hausman- is not feasible with non-default standard errors, nor is it correct to switch to cluster/robust standard errors after running the -hausman- test.
                Does that mean -hausman- is not possible with heteroskedastic data? Either way, my -xtoverid- output is unaffected by this, right?

                Additionally, 2 issues arise from that conclusion:

                1) Now I want to check whether my dataset suffers from heteroskedasticity.
                Since I am working with a fe panel model, the hettest command is not available.
                While looking for ways to test for heteroskedasticity, I came across the user-written xttest2 and xttest3. xttest2 doesn't work, but xttest3 does.
                So after running my regression
                Code:
                 xtreg performance promoted_in_observation_period (…) c.Age##c.Age i.year, fe
                xttest3 delivers
                Code:
                 H0: sigma(i)^2 = sigma^2 for all i
                chi2 (195)  =   3.0e+34
                Prob>chi2 =      0.0000
                implying that the null hypothesis of homoskedasticity is strongly rejected.

                Do I have a massive heteroskedasticity problem now?
                Is
                Code:
                 xtreg (…) , fe robust
                “enough” to deal with such strong heteroskedasticity?

                2) I used the robust option with -xtoverid- to check whether to use fe or re (see above).
                Isn't it somewhat circular to choose the fe model based on a test of a heteroskedasticity-corrected model, and only afterwards check whether heteroskedasticity is a problem at all?
                But if I go the other way round, I don't know whether I can use the xttest3 command, as I haven't first checked whether I can use a fe model rather than re (via xtoverid).
                I hope it is clear what I mean.

                Cheers, Alex



                • #9
                  Alex:
                  1) the -hausman- test allows default standard errors only. Hence, if you suspect heteroskedasticity and/or autocorrelation in your -xtreg- data, you should invoke robust/cluster standard errors, which points you directly to -xtoverid- to choose between the -fe- and -re- specifications.
                  2) if you have heteroskedasticity and/or autocorrelation and you wisely invoke robust/cluster standard errors, there is no point in re-testing whether your data suffer from heteroskedasticity and/or autocorrelation.
                  Kind regards,
                  Carlo
                  (Stata 19.0)



                  • #10
                    Carlo,
                    thanks for the reply.

                    1) I see. But given that hausman can deal only with default standard errors while xtoverid can also deal with non-default ones, is there any point in running xtoverid without the robust option?
                    2) Can I simply assume I have heteroskedasticity in my dataset and therefore use the robust option by default for xtreg as well as for xtoverid? That does not sound academically unassailable to me.

                    Thx, Alex



                    • #11
                      Alex:
                      1) as per the -xtoverid- help file, you can run this test with or without the cluster/robust option. If you invoke default standard errors, the results are close to the ones reported by -hausman- (the two tests are asymptotically equivalent under homoskedasticity), as you can see from the following toy example:
                      Code:
                      . use "http://www.stata-press.com/data/r15/nlswork.dta"
                      (National Longitudinal Survey.  Young Women 14-26 years of age in 1968)
                      
                      . xtreg ln_wage age, re
                      
                      Random-effects GLS regression                   Number of obs     =     28,510
                      Group variable: idcode                          Number of groups  =      4,710
                      
                      R-sq:                                           Obs per group:
                           within  = 0.1026                                         min =          1
                           between = 0.0877                                         avg =        6.1
                           overall = 0.0774                                         max =         15
                      
                                                                      Wald chi2(1)      =    3140.35
                      corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000
                      
                      ------------------------------------------------------------------------------
                           ln_wage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
                      -------------+----------------------------------------------------------------
                               age |   .0185667   .0003313    56.04   0.000     .0179174    .0192161
                             _cons |   1.120439   .0112038   100.01   0.000      1.09848    1.142398
                      -------------+----------------------------------------------------------------
                           sigma_u |  .36972456
                           sigma_e |  .30349389
                               rho |  .59743613   (fraction of variance due to u_i)
                      ------------------------------------------------------------------------------
                      
                      . xtoverid
                      
                      Test of overidentifying restrictions: fixed vs random effects
                      Cross-section time-series model: xtreg re  
                      Sargan-Hansen statistic  17.401  Chi-sq(1)    P-value = 0.0000
                      
                      . xtreg ln_wage age, fe
                      
                      Fixed-effects (within) regression               Number of obs     =     28,510
                      Group variable: idcode                          Number of groups  =      4,710
                      
                      R-sq:                                           Obs per group:
                           within  = 0.1026                                         min =          1
                           between = 0.0877                                         avg =        6.1
                           overall = 0.0774                                         max =         15
                      
                                                                      F(1,23799)        =    2720.20
                      corr(u_i, Xb)  = 0.0314                         Prob > F          =     0.0000
                      
                      ------------------------------------------------------------------------------
                           ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                      -------------+----------------------------------------------------------------
                               age |   .0181349   .0003477    52.16   0.000     .0174534    .0188164
                             _cons |   1.148214   .0102579   111.93   0.000     1.128107     1.16832
                      -------------+----------------------------------------------------------------
                           sigma_u |  .40635023
                           sigma_e |  .30349389
                               rho |  .64192015   (fraction of variance due to u_i)
                      ------------------------------------------------------------------------------
                      F test that all u_i=0: F(4709, 23799) = 8.81                 Prob > F = 0.0000
                      
                      . estimates store fe
                      
                      . xtreg ln_wage age, re
                      
                      Random-effects GLS regression                   Number of obs     =     28,510
                      Group variable: idcode                          Number of groups  =      4,710
                      
                      R-sq:                                           Obs per group:
                           within  = 0.1026                                         min =          1
                           between = 0.0877                                         avg =        6.1
                           overall = 0.0774                                         max =         15
                      
                                                                      Wald chi2(1)      =    3140.35
                      corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000
                      
                      ------------------------------------------------------------------------------
                           ln_wage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
                      -------------+----------------------------------------------------------------
                               age |   .0185667   .0003313    56.04   0.000     .0179174    .0192161
                             _cons |   1.120439   .0112038   100.01   0.000      1.09848    1.142398
                      -------------+----------------------------------------------------------------
                           sigma_u |  .36972456
                           sigma_e |  .30349389
                               rho |  .59743613   (fraction of variance due to u_i)
                      ------------------------------------------------------------------------------
                      
                      . estimates store re
                      
                      . hausman fe re
                      
                                       ---- Coefficients ----
                                   |      (b)          (B)            (b-B)     sqrt(diag(V_b-V_B))
                                   |       fe           re         Difference          S.E.
                      -------------+----------------------------------------------------------------
                               age |    .0181349     .0185667       -.0004318        .0001055
                      ------------------------------------------------------------------------------
                                                 b = consistent under Ho and Ha; obtained from xtreg
                                  B = inconsistent under Ha, efficient under Ho; obtained from xtreg
                      
                          Test:  Ho:  difference in coefficients not systematic
                      
                                        chi2(1) = (b-B)'[(V_b-V_B)^(-1)](b-B)
                                                =       16.76
                                      Prob>chi2 =      0.0000
                      
                      .
                      2) the issue with your suggested approach is that, under -xtreg-, the robust and cluster options give the same cluster-robust standard errors (put differently, you cannot choose between them as you can under -regress-), and these may be biased (in fact, more biased than default standard errors) if the number of clusters is too small. Hence, a judgement call by the researcher is unavoidable.
                      Kind regards,
                      Carlo
                      (Stata 19.0)
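
                      A quick way to check the point about robust versus cluster under -xtreg-, again assuming the nlswork toy data (the stored names rob and clu are arbitrary):
                      Code:
                      webuse nlswork, clear
                      xtset idcode year
                      xtreg ln_wage age, fe vce(robust)
                      estimates store rob
                      xtreg ln_wage age, fe vce(cluster idcode)
                      estimates store clu
                      * coefficients and standard errors side by side
                      estimates table rob clu, se
                      If -xtreg- treats vce(robust) as clustering on the panel variable, the two columns are expected to match.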



                      • #12
                        Carlo,
                        thanks for the clarification.
                        But how can I justify using xtreg ..., fe robust without testing for heteroskedasticity at all?
                        I am still a little confused about why I would not use xttest3.

                        Best, Alex



                        • #13
                          Alex:
                          you can also visually investigate the residual distribution.
                          Hence, you are not forced to use -xttest3-.
                          However, if you decide to stick with -xttest3- you have two possible outcomes:
                          1) your residual distribution is homoskedastic: no need to re-run -xtreg,fe- with cluster/robust standard errors; run -xtreg,re- with default standard errors; test via -hausman- which specification fits your data better;
                          2) your residual distribution is heteroskedastic: you should re-run -xtreg,fe- with cluster/robust standard errors; run -xtreg,re- with cluster/robust standard errors; test via -xtoverid- which specification fits your data better.
                          Kind regards,
                          Carlo
                          (Stata 19.0)
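
                          The two branches can be sketched as follows, assuming the balanced Grunfeld toy data and the user-written -xttest3- and -xtoverid- (both from SSC); with only 10 firms the cluster-robust branch is purely illustrative:
                          Code:
                          webuse grunfeld, clear
                          xtset company year
                          xtreg invest mvalue kstock, fe
                          estimates store fe
                          * groupwise heteroskedasticity test, run after -xtreg, fe-
                          xttest3
                          * 1) homoskedasticity not rejected: default SEs and -hausman-
                          xtreg invest mvalue kstock, re
                          estimates store re
                          hausman fe re
                          * 2) homoskedasticity rejected: cluster-robust SEs and -xtoverid-
                          xtreg invest mvalue kstock, fe vce(cluster company)
                          xtreg invest mvalue kstock, re vce(cluster company)
                          xtoverid
                          Only one of the two branches would be run in practice, depending on the -xttest3- (or visual) evidence.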



                          • #14
                            Carlo,

                            thanks again.

                            1) For the visual inspection I stumbled over
                            Code:
                            predict s1, xb
                            predict s2, residual
                            scatter s2 s1
                            Is this correct for my setting with fe panel data? It confuses me somewhat, as I would expect the residuals on the y-axis and the years of my panel (i.e. 2000-2015) on the x-axis.

                            2) Thanks a lot, Carlo. This is very helpful!
                            Linked to heteroskedasticity is the issue of autocorrelation. As I understand from the literature and earlier posts, the -robust- option handles both under -xtreg-.
                            Does this imply that, if I proceed as discussed above, I don't need to worry about autocorrelation (given that I have heteroskedasticity)? If autocorrelation is present it is corrected for, and if it is absent there is no problem anyway.

                            Cheers, A.
                            Last edited by Alex Mueller; 12 Jun 2018, 03:23.



                            • #15
                              Alex:
                              1) visual inspection:
                              Code:
                              . use "http://www.stata-press.com/data/r15/nlswork.dta"
                              (National Longitudinal Survey.  Young Women 14-26 years of age in 1968)
                              
                              . xtreg ln_wage age, fe
                              
                              Fixed-effects (within) regression               Number of obs     =     28,510
                              Group variable: idcode                          Number of groups  =      4,710
                              
                              R-sq:                                           Obs per group:
                                   within  = 0.1026                                         min =          1
                                   between = 0.0877                                         avg =        6.1
                                   overall = 0.0774                                         max =         15
                              
                                                                              F(1,23799)        =    2720.20
                              corr(u_i, Xb)  = 0.0314                         Prob > F          =     0.0000
                              
                              ------------------------------------------------------------------------------
                                   ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                              -------------+----------------------------------------------------------------
                                       age |   .0181349   .0003477    52.16   0.000     .0174534    .0188164
                                     _cons |   1.148214   .0102579   111.93   0.000     1.128107     1.16832
                              -------------+----------------------------------------------------------------
                                   sigma_u |  .40635023
                                   sigma_e |  .30349389
                                       rho |  .64192015   (fraction of variance due to u_i)
                              ------------------------------------------------------------------------------
                              F test that all u_i=0: F(4709, 23799) = 8.81                 Prob > F = 0.0000
                              
                              . predict res, e
                              (24 missing values generated)
                              
                              . histogram res
                              (bin=44, start=-1.9891924, width=.11140382)
                              
                              .
                              2) with the robust/cluster options you do not have to worry about heteroskedasticity and/or autocorrelation under -xtreg-.
                              Kind regards,
                              Carlo
                              (Stata 19.0)
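
                              A minimal sketch of the residual-versus-fitted plot asked about in #14, assuming the nlswork toy data and hypothetical variable names xb_hat and e_hat:
                              Code:
                              webuse nlswork, clear
                              xtset idcode year
                              xtreg ln_wage age, fe
                              * after -xtreg, fe- the idiosyncratic residual is requested with e
                              predict xb_hat, xb
                              predict e_hat, e
                              * residuals against fitted values
                              scatter e_hat xb_hat, yline(0) msize(tiny)
                              * spread of the residuals by panel year
                              graph box e_hat, over(year)
                              A fan-shaped scatter, or boxes whose spread changes markedly across years, would hint at heteroskedasticity.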
