  • Problem with Hausman Test (already read 4 previous threads but still stuck)

    First of all, this is my first post on this forum, and I appreciate the advice I have already picked up from reading previous threads.

    I am currently writing my master's thesis and I am a bit stuck on the empirical part because of problems with the Hausman test. I am using a panel data set consisting of the 5 biggest US retail companies:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str10 datadate int fiscalyear byte fiscalquarter str4 tickersymbol str21 companyname float(inv rev netinc mv ltdebt empl roa Date CompanyID)
    "28/2/2005"  2005 2 "COST" "COSTCO WHOLESALE CORP" 4002.654 12658.077  498.605         . 732.589 110 .030312054 16495 1
    "31/5/2005"  2005 3 "COST" "COSTCO WHOLESALE CORP" 4040.253  12006.21  708.393         . 715.448 110  .04221042 16587 1
    "31/8/2005"  2005 4 "COST" "COSTCO WHOLESALE CORP" 4014.699 16709.936 1063.092         . 710.675 110  .06437659 16679 1
    "30/11/2005" 2006 1 "COST" "COSTCO WHOLESALE CORP" 4825.284 12933.346  215.818  23767.61  546.82 127  .01241602 16770 1
    "28/2/2006"  2006 2 "COST" "COSTCO WHOLESALE CORP" 4277.534 14059.012  512.021 24040.615 536.998 127  .02981101 16860 1
    end
    I created the variables Date and CompanyID so that I could use xtset properly. CompanyID identifies the company, and Date holds the date in the numeric format you can see above (I googled it because I am using quarterly data and couldn't do it another way). The "fundamental" variables are Revenue, Return on Assets and Market Value, while the "secondary" variables are Inventory, Long-term Debt and Number of Employees.
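
    One possible way to build a quarterly time variable for xtset, as a minimal sketch only. It assumes datadate is a day/month/year string as in the -dataex- listing above; the names ddate and qdate are hypothetical. It maps each date to its calendar quarter (not the fiscal quarter), which may or may not be what the thesis needs.

    ```stata
    * Toy data: only the date string and the panel id from the listing above
    clear
    input str10 datadate byte CompanyID
    "28/2/2005"  1
    "31/5/2005"  1
    "31/8/2005"  1
    "30/11/2005" 1
    "28/2/2006"  1
    end

    * Convert the string to a Stata daily date (ddate is a hypothetical name)
    gen ddate = date(datadate, "DMY")
    format ddate %td

    * Collapse the daily date to a calendar quarter (qdate is a hypothetical name)
    gen qdate = qofd(ddate)
    format qdate %tq

    * Declare the panel with a quarterly time variable, so consecutive quarters are one period apart
    xtset CompanyID qdate
    ```

    With a quarterly time variable, time-series operators such as L. refer to the previous quarter rather than the previous day.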

    The main focus of my thesis is to analyze the relationship between the "fundamental" variables and the "secondary" variables. To do so, I run 3 regressions, each time making one of the fundamental variables the dependent variable and using the remaining 5 as independent variables. The problem is that the Hausman test gives an error in 2 of the 3 regressions:

    Code:
    . hausman fe_result re_result
    
    Note: the rank of the differenced variance matrix (2) does not equal the number of coefficients being
            tested (5); be sure this is what you expect, or there may be problems computing the test.  Examine
            the output of your estimators for anything unexpected and possibly consider scaling your variables
            so that the coefficients are on a similar scale.
    
                     ---- Coefficients ----
                 |      (b)          (B)            (b-B)     sqrt(diag(V_b-V_B))
                 |   fe_result    re_result      Difference          S.E.
    -------------+----------------------------------------------------------------
              mv |    .0255729     .0218857        .0036872               .
             roa |    26564.38    -50143.92        76708.31               .
             inv |    1.009677     1.870039       -.8603618               .
          ltdebt |    .3181855     .2454759        .0727096               .
            empl |     48.5215     9.213474        39.30802        8.203313
    ------------------------------------------------------------------------------
                               b = consistent under Ho and Ha; obtained from xtreg
                B = inconsistent under Ha, efficient under Ho; obtained from xtreg
    
        Test:  Ho:  difference in coefficients not systematic
    
                      chi2(2) = (b-B)'[(V_b-V_B)^(-1)](b-B)
                              =   -10.39    chi2<0 ==> model fitted on these
                                            data fails to meet the asymptotic
                                            assumptions of the Hausman test;
                                            see suest for a generalized test


    Code:
    . hausman fe_result re_result
    
                     ---- Coefficients ----
                 |      (b)          (B)            (b-B)     sqrt(diag(V_b-V_B))
                 |   fe_result    re_result      Difference          S.E.
    -------------+----------------------------------------------------------------
              mv |    4.16e-07     5.99e-07       -1.83e-07        2.97e-08
             rev |    8.36e-07    -6.25e-07        1.46e-06        2.72e-07
             inv |   -2.81e-06    -1.49e-06       -1.32e-06        3.80e-07
          ltdebt |   -4.47e-07    -1.08e-07       -3.39e-07        1.08e-07
            empl |   -.0000399     4.70e-06       -.0000446        .0000508
    ------------------------------------------------------------------------------
                               b = consistent under Ho and Ha; obtained from xtreg
                B = inconsistent under Ha, efficient under Ho; obtained from xtreg
    
        Test:  Ho:  difference in coefficients not systematic
    
                      chi2(5) = (b-B)'[(V_b-V_B)^(-1)](b-B)
                              =       45.74
                    Prob>chi2 =      0.0000
                    (V_b-V_B is not positive definite)

    I used dataex and the CODE delimiters, but I'm not sure how I could present my results better, since this is my first time posting.

    Thank you in advance for your time and effort. I would be grateful for any kind of help.

  • #2
    Eddie:
    this is a frequent nuisance of -hausman-.
    You can test whether the -re- specification is the way to go for your data via the user-written command -xtoverid- (just type -search xtoverid- from within Stata to spot and install it).
    Kind regards,
    Carlo
    (Stata 19.0)
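
    For reference, the statistic printed on the chi2 line of the -hausman- output in the first post can be written as follows, where b and V_b are the -fe- coefficients and their variance matrix, and B and V_B are the -re- counterparts:

    ```latex
    H = (b - B)^{\top}\,\bigl(V_b - V_B\bigr)^{-1}\,(b - B)
    ```

    The difference V_b - V_B is guaranteed to be positive semi-definite only asymptotically. In a finite sample the estimated difference can have negative diagonal entries (hence the missing S.E. values, which are square roots of that diagonal) or fail to be positive definite, and H can then come out negative, as with the -10.39 above.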



    • #3
      Thank you, Mr. Carlo, for your immediate response.

      I have already downloaded xtoverid and read its help file over the past few days. The problem is that I don't understand how I can actually benefit from it, since I don't have a good background in statistics. I only have the basic knowledge that someone gets in a Financial Management master's.
      Another question I have is whether there is any chance that my dataset is not suitable and needs to be changed somehow.

      Best regards,
      Eddie



      • #4
        Eddie:
        see the following toy-example:
        Code:
        use "https://www.stata-press.com/data/r16/nlswork.dta"
        . xtreg ln_wage age , re
        
        Random-effects GLS regression                   Number of obs     =     28,510
        Group variable: idcode                          Number of groups  =      4,710
        
        R-sq:                                           Obs per group:
             within  = 0.1026                                         min =          1
             between = 0.0877                                         avg =        6.1
             overall = 0.0774                                         max =         15
        
                                                        Wald chi2(1)      =    3140.35
        corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000
        
        ------------------------------------------------------------------------------
             ln_wage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
                 age |   .0185667   .0003313    56.04   0.000     .0179174    .0192161
               _cons |   1.120439   .0112038   100.01   0.000      1.09848    1.142398
        -------------+----------------------------------------------------------------
             sigma_u |  .36972456
             sigma_e |  .30349389
                 rho |  .59743613   (fraction of variance due to u_i)
        ------------------------------------------------------------------------------
        
        . xtoverid
        
        Test of overidentifying restrictions: fixed vs random effects
        Cross-section time-series model: xtreg re  
        Sargan-Hansen statistic  17.401  Chi-sq(1)    P-value = 0.0000
        
        .
        -xtoverid- outcome points toward -fe- (because the null is, non-technically speaking, that -re- is the way to go).
        Kind regards,
        Carlo
        (Stata 19.0)



        • #5
          I tried to do what you posted, but this message appears:

          Code:
          . xtoverid
          Error - must have ivreg2/ivreg29/ivreg28 version 2.1.15 or greater installed
          r(601);



          • #6
            Eddie:
            just download one of the suggested commands and re-run -xtoverid-.
            Kind regards,
            Carlo
            (Stata 19.0)



            • #7
              As far as I understand, these commands are about instrumental variables, but I don't have any instrumental variables in my model. Are you implying that I should implement some instrumental variables?

              The problem is that in my model all the variables are endogenous within companies (Revenue, Market Value, RoA, Inventory, Employees, Long-term Debt). How can I check the impact of these variables without running into trouble with endogeneity or autocorrelation?



              • #8
                Eddie:
                the way -xtoverid- is conceived, it can handle a -hausman--like test even with non-default standard errors.
                This does not imply an instrumental panel regression.
                I do not understand what you mean by "endogenous within companies". Sorry for that.
                Kind regards,
                Carlo
                (Stata 19.0)



                • #9
                  The first part of my question was about the commands that you proposed to download above (ivreg2/ivreg29/ivreg28). Don't they need instrumental variables to proceed?

                  The second part of my question is that I am worried about autocorrelation in my variables. Might instrumental variables such as GDP or inflation solve my problem?

                  I am sorry for my vocabulary, but as I mentioned in my first post, I don't have the best possible background in statistics.



                  • #10
                    Eddie:
                    in my reply #4 the toy-example included an -xtreg,re- code and then the use of -xtoverid-. There's no trace of instrumental variable regression, though.
                    It's true that -xtoverid- needs instrumental variable-related commands created by the Stata community to express all of its capabilities, but instrumental variable regression is not a prerequisite to exploit -xtoverid-.
                    If you're worried about autocorrelation and you have a short panel with 5 US firms as the cross-sectional dimension and a T dimension (years, to keep it simple) that is larger than N, you should simply invoke -robust- or -vce(cluster clusterid)- for your standard errors (they do the very same job under -xtreg-): this will take both heteroskedasticity and/or autocorrelation into account.
                    Then you can test via -xtoverid- which specification (-fe- or -re-) fits your data better.
                    Kind regards,
                    Carlo
                    (Stata 19.0)
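
                    The workflow described in this reply could look roughly like the sketch below. It reuses the toy dataset from #4 rather than the thesis data, and it assumes internet access and that the community-contributed packages can be installed from SSC.

                    ```stata
                    * Assumption: packages are installed from SSC; -xtoverid- relies on -ivreg2-, which in turn relies on -ranktest-
                    ssc install ranktest, replace
                    ssc install ivreg2, replace
                    ssc install xtoverid, replace

                    * Same toy dataset as in reply #4
                    use "https://www.stata-press.com/data/r16/nlswork.dta", clear
                    xtset idcode year

                    * Random-effects model with standard errors clustered on the panel identifier
                    xtreg ln_wage age, re vce(cluster idcode)

                    * Test of -fe- vs -re- after the -re- fit; a small p-value points toward -fe-
                    xtoverid
                    ```

                    For the thesis data, the panel identifier would presumably be CompanyID and the regressors the ones listed in the first post.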



                    • #11
                      So I guess you propose something like this?

                      Code:
                      . xtreg mv rev roa inv ltdebt empl, robust
                      
                      Random-effects GLS regression                   Number of obs     =        265
                      Group variable: CompanyID                       Number of groups  =          5
                      
                      R-sq:                                           Obs per group:
                           within  = 0.3885                                         min =         52
                           between = 0.9892                                         avg =       53.0
                           overall = 0.8671                                         max =         55
                      
                                                                      Wald chi2(4)      =          .
                      corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =          .
                      
                                                    (Std. Err. adjusted for 5 clusters in CompanyID)
                      ------------------------------------------------------------------------------
                                   |               Robust
                                mv |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
                      -------------+----------------------------------------------------------------
                               rev |   .2455791   .3523976     0.70   0.486    -.4451074    .9362656
                               roa |   513917.5   145736.1     3.53   0.000       228280    799555.1
                               inv |   5.114071   1.741593     2.94   0.003     1.700611    8.527531
                            ltdebt |   .2030571   .4506348     0.45   0.652    -.6801708    1.086285
                              empl |  -15.19633   19.23774    -0.79   0.430    -52.90161    22.50896
                             _cons |  -13124.45   10221.78    -1.28   0.199    -33158.77    6909.872
                      -------------+----------------------------------------------------------------
                           sigma_u |          0
                           sigma_e |  25398.461
                               rho |          0   (fraction of variance due to u_i)
                      ------------------------------------------------------------------------------
                      
                      . 
                      end of do-file
                      
                      . xtoverid
                      Error - saved RE estimates are degenerate (sigma_u=0) and equivalent to pooled OLS
                      r(198);



                      • #12
                        Eddie:
                        Yes.
                        But in your example you do not have any panel-wise effect, as is apparent from sigma_u=0 in your -xtreg,re- outcome table.
                        Kind regards,
                        Carlo
                        (Stata 19.0)



                        • #13
                          And this means I can use this model? These are the correct coefficients I should use?



                          • #14
                            Eddie:
                            no, it means that you have to switch to a pooled OLS.
                            Kind regards,
                            Carlo
                            (Stata 19.0)
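
                            A pooled OLS fit could be sketched as below. This is only an illustration on the toy dataset from #4; keeping the clustered standard errors mentioned in #10 is an assumption about what is wanted here.

                            ```stata
                            * Same toy dataset as in reply #4
                            use "https://www.stata-press.com/data/r16/nlswork.dta", clear

                            * Pooled OLS: -regress- ignores the panel structure in the point estimates,
                            * while vce(cluster) still allows for within-panel correlation in the standard errors
                            regress ln_wage age, vce(cluster idcode)

                            * Presumed equivalent for the thesis data (assumes that dataset is in memory):
                            * regress mv rev roa inv ltdebt empl, vce(cluster CompanyID)
                            ```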



                            • #15
                              So I use the normal reg instead of xtreg, right?
