Problem with Hausman Test (already read 4 previous threads but still stuck)

Eddie Mateosian

Join Date: Jun 2020
Posts: 18

04 Jun 2020, 11:16

First of all, this is my first post on this forum, and I appreciate the advice I have already picked up from reading previous threads.

I am currently writing my master's thesis and I am a bit stuck on the empirical part because of problems with the Hausman test. I am using a panel data set consisting of the 5 biggest US retail companies:

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input str10 datadate int fiscalyear byte fiscalquarter str4 tickersymbol str21 companyname float(inv rev netinc mv ltdebt empl roa Date CompanyID)
"28/2/2005"  2005 2 "COST" "COSTCO WHOLESALE CORP" 4002.654 12658.077  498.605         . 732.589 110 .030312054 16495 1
"31/5/2005"  2005 3 "COST" "COSTCO WHOLESALE CORP" 4040.253  12006.21  708.393         . 715.448 110  .04221042 16587 1
"31/8/2005"  2005 4 "COST" "COSTCO WHOLESALE CORP" 4014.699 16709.936 1063.092         . 710.675 110  .06437659 16679 1
"30/11/2005" 2006 1 "COST" "COSTCO WHOLESALE CORP" 4825.284 12933.346  215.818  23767.61  546.82 127  .01241602 16770 1
"28/2/2006"  2006 2 "COST" "COSTCO WHOLESALE CORP" 4277.534 14059.012  512.021 24040.615 536.998 127  .02981101 16860 1
end

The variables Date and CompanyID were created so that I could use xtset properly. CompanyID identifies the company and Date holds the date in the format you can see above (I googled how to do this because I use quarterly data and couldn't find another way). The "fundamental" variables are Revenue, Return on Assets and Market Value, while the "secondary" variables are Inventory, Long-term Debt and Number of Employees.

The main focus of my thesis is to analyze the relationship between the "fundamental" variables and the "secondary" variables. To do so I run 3 regressions, making one of the fundamental variables the dependent variable each time and using the remaining 5 as independent variables. The problem is that the Hausman test gives an error in 2 of the 3 regressions:

Code:

. hausman fe_result re_result

Note: the rank of the differenced variance matrix (2) does not equal the number of coefficients being
        tested (5); be sure this is what you expect, or there may be problems computing the test.  Examine
        the output of your estimators for anything unexpected and possibly consider scaling your variables
        so that the coefficients are on a similar scale.

                 ---- Coefficients ----
             |      (b)          (B)            (b-B)     sqrt(diag(V_b-V_B))
             |   fe_result    re_result      Difference          S.E.
-------------+----------------------------------------------------------------
          mv |    .0255729     .0218857        .0036872               .
         roa |    26564.38    -50143.92        76708.31               .
         inv |    1.009677     1.870039       -.8603618               .
      ltdebt |    .3181855     .2454759        .0727096               .
        empl |     48.5215     9.213474        39.30802        8.203313
------------------------------------------------------------------------------
                           b = consistent under Ho and Ha; obtained from xtreg
            B = inconsistent under Ha, efficient under Ho; obtained from xtreg

    Test:  Ho:  difference in coefficients not systematic

                  chi2(2) = (b-B)'[(V_b-V_B)^(-1)](b-B)
                          =   -10.39    chi2<0 ==> model fitted on these
                                        data fails to meet the asymptotic
                                        assumptions of the Hausman test;
                                        see suest for a generalized test

Code:

. hausman fe_result re_result

                 ---- Coefficients ----
             |      (b)          (B)            (b-B)     sqrt(diag(V_b-V_B))
             |   fe_result    re_result      Difference          S.E.
-------------+----------------------------------------------------------------
          mv |    4.16e-07     5.99e-07       -1.83e-07        2.97e-08
         rev |    8.36e-07    -6.25e-07        1.46e-06        2.72e-07
         inv |   -2.81e-06    -1.49e-06       -1.32e-06        3.80e-07
      ltdebt |   -4.47e-07    -1.08e-07       -3.39e-07        1.08e-07
        empl |   -.0000399     4.70e-06       -.0000446        .0000508
------------------------------------------------------------------------------
                           b = consistent under Ho and Ha; obtained from xtreg
            B = inconsistent under Ha, efficient under Ho; obtained from xtreg

    Test:  Ho:  difference in coefficients not systematic

                  chi2(5) = (b-B)'[(V_b-V_B)^(-1)](b-B)
                          =       45.74
                Prob>chi2 =      0.0000
                (V_b-V_B is not positive definite)

I used dataex and the CODE delimiters, but I'm not sure how I could present my results better since this is my first time posting.

Thank you in advance for your time and effort. I would be grateful for any kind of help.


Carlo Lazzaro

Join Date: Apr 2014

Posts: 17707
#2

04 Jun 2020, 11:24

Eddie:
this is a frequent nuisance with -hausman-.
You can test whether the -re- specification is the way to go for your data via the user-written command -xtoverid- (just type -search xtoverid- from within Stata to find and install it).

Kind regards,
Carlo
(Stata 19.0)
Eddie Mateosian

Join Date: Jun 2020

Posts: 18
#3

04 Jun 2020, 11:36

Thank you, Mr. Carlo, for your immediate response.

I already installed -xtoverid- and read its help file during the past few days. The problem is that I don't understand how I can actually benefit from it, since I don't have a good background in statistics. I only have the basic knowledge that one gets in a Financial Management master's program.
Another question I have is whether there is any chance that my dataset is not set up properly and needs to be changed somehow.

Best regards,
Eddie

Carlo Lazzaro

Join Date: Apr 2014
Posts: 17707
#4

04 Jun 2020, 12:20

Eddie:
see the following toy-example:

Code:

use "https://www.stata-press.com/data/r16/nlswork.dta"
. xtreg ln_wage age , re

Random-effects GLS regression                   Number of obs     =     28,510
Group variable: idcode                          Number of groups  =      4,710

R-sq:                                           Obs per group:
     within  = 0.1026                                         min =          1
     between = 0.0877                                         avg =        6.1
     overall = 0.0774                                         max =         15

                                                Wald chi2(1)      =    3140.35
corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000

------------------------------------------------------------------------------
     ln_wage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .0185667   .0003313    56.04   0.000     .0179174    .0192161
       _cons |   1.120439   .0112038   100.01   0.000      1.09848    1.142398
-------------+----------------------------------------------------------------
     sigma_u |  .36972456
     sigma_e |  .30349389
         rho |  .59743613   (fraction of variance due to u_i)
------------------------------------------------------------------------------

. xtoverid

Test of overidentifying restrictions: fixed vs random effects
Cross-section time-series model: xtreg re  
Sargan-Hansen statistic  17.401  Chi-sq(1)    P-value = 0.0000

.

-xtoverid- outcome points toward -fe- (because the null is, non-technically speaking, that -re- is the way to go).

Kind regards,
Carlo
(Stata 19.0)


Eddie Mateosian

Join Date: Jun 2020

Posts: 18
#5

04 Jun 2020, 12:24

I tried to do what you posted but this message appears :

Code:

. xtoverid
Error - must have ivreg2/ivreg29/ivreg28 version 2.1.15 or greater installed
r(601);
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17707
#6

04 Jun 2020, 13:35

Eddie:
just download one of the suggested commands and re-run -xtoverid-.
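A minimal sketch of the installation, assuming the packages are fetched from SSC and that -ivreg2- in turn relies on -ranktest- (please check the help files if in doubt):

Code:

* install -xtoverid- and the command it depends on (assumed to be available on SSC)
ssc install ranktest
ssc install ivreg2
ssc install xtoverid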

Kind regards,
Carlo
(Stata 19.0)
Eddie Mateosian

Join Date: Jun 2020

Posts: 18
#7

05 Jun 2020, 04:52

As far as I understand, these commands are about instrumental variables, but I don't have any instrumental variables in my model. Do you mean that I should implement some instrumental variables?

The problem is that in my model all the variables are endogenous within companies (Revenue, Market Value, RoA, Inventory, Employees, Long-term Debt). How can I check the impact of these variables without running into trouble with endogeneity or autocorrelation?
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17707
#8

05 Jun 2020, 07:11

Eddie:
the way -xtoverid- is conceived, it can handle a -hausman--like test with non-default standard errors.
This does not imply an instrumental variable panel regression.
I do not understand what you mean by "endogenous within companies". Sorry for that.

Kind regards,
Carlo
(Stata 19.0)
Eddie Mateosian

Join Date: Jun 2020

Posts: 18
#9

05 Jun 2020, 11:05

The first part of my question was about the commands that you proposed to download above (ivreg2/ivreg29/ivreg28). Don't they need instrumental variables to proceed?

The second part of my question is that I am worried about autocorrelation in my variables. Might instrumental variables such as GDP or inflation solve my problem?

I am sorry for my vocabulary, but as I mentioned in my first post I don't have the best possible background in statistics.
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17707
#10

05 Jun 2020, 11:32

Eddie:
in my reply #4 the toy-example included an -xtreg,re- code, then the use of -xtoverid-. There's no trace of instrumental variable regression, though.
It's true that -xtoverid- needs instrumental variable-related commands created by the Stata community to express all of its capabilities, but instrumental variable regression is not a prerequisite to exploit -xtoverid-.
If you're worried about autocorrelation and you have a short panel with 5 US firms as the cross-sectional dimension and a T dimension (years, to keep it simple) that is larger than N, you should simply invoke -robust- or -vce(cluster clusterid)- for your standard errors (they do the very same job under -xtreg-): this will take both heteroskedasticity and autocorrelation into account.
Then you can test via -xtoverid- which specification (-fe- or -re-) fits your data better.
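A minimal sketch of that sequence, using the variable names from your -dataex- example and assuming the data are -xtset- on CompanyID and Date (swap the dependent variable for each of your three regressions):

Code:

* declare the panel structure (assumed: CompanyID as panel id, Date as time variable)
xtset CompanyID Date

* random-effects model with standard errors clustered on the panel id
xtreg mv rev roa inv ltdebt empl, re vce(cluster CompanyID)

* fe vs re test that works with cluster-robust standard errors
xtoverid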

Kind regards,
Carlo
(Stata 19.0)

Eddie Mateosian

Join Date: Jun 2020
Posts: 18

#11

05 Jun 2020, 11:51

So I guess you propose something like this?

Code:

. xtreg mv rev roa inv ltdebt empl, robust

Random-effects GLS regression                   Number of obs     =        265
Group variable: CompanyID                       Number of groups  =          5

R-sq:                                           Obs per group:
     within  = 0.3885                                         min =         52
     between = 0.9892                                         avg =       53.0
     overall = 0.8671                                         max =         55

                                                Wald chi2(4)      =          .
corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =          .

                              (Std. Err. adjusted for 5 clusters in CompanyID)
------------------------------------------------------------------------------
             |               Robust
          mv |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         rev |   .2455791   .3523976     0.70   0.486    -.4451074    .9362656
         roa |   513917.5   145736.1     3.53   0.000       228280    799555.1
         inv |   5.114071   1.741593     2.94   0.003     1.700611    8.527531
      ltdebt |   .2030571   .4506348     0.45   0.652    -.6801708    1.086285
        empl |  -15.19633   19.23774    -0.79   0.430    -52.90161    22.50896
       _cons |  -13124.45   10221.78    -1.28   0.199    -33158.77    6909.872
-------------+----------------------------------------------------------------
     sigma_u |          0
     sigma_e |  25398.461
         rho |          0   (fraction of variance due to u_i)
------------------------------------------------------------------------------

. 
end of do-file

. xtoverid
Error - saved RE estimates are degenerate (sigma_u=0) and equivalent to pooled OLS
r(198);


Carlo Lazzaro

Join Date: Apr 2014

Posts: 17707
#12

05 Jun 2020, 12:17

Eddie:
Yes.
But in your example you do not have any panel-wise effect, as is apparent from sigma_u=0 in your -xtreg,re- outcome table.

Kind regards,
Carlo
(Stata 19.0)
Eddie Mateosian

Join Date: Jun 2020

Posts: 18
#13

05 Jun 2020, 12:25

And this means I can use this model? These are the correct coefficients I should use?
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17707
#14

05 Jun 2020, 13:47

Eddie:
no, it means that you have to switch to a pooled OLS.
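A minimal sketch of the pooled OLS, assuming you keep the same regressors and still cluster the standard errors on CompanyID:

Code:

* pooled OLS with standard errors clustered on the panel id
regress mv rev roa inv ltdebt empl, vce(cluster CompanyID)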

Kind regards,
Carlo
(Stata 19.0)
Eddie Mateosian

Join Date: Jun 2020

Posts: 18
#15

05 Jun 2020, 17:51

So I use the normal -reg- instead of -xtreg-, right?