Hausman Test - different results FE and RE model

Louisa Krekel

Join Date: Jul 2015

Posts: 33
#1

20 Aug 2015, 01:59

Hi everyone,

I am investigating the effect of advertising bans on tobacco consumption. My dependent variable is tobacco consumption (logcons) and my explanatory variables are advertising ban dummies (weak, limited and comprehensive; I include only limited, “lim”, and comprehensive, “compr”, to avoid perfect multicollinearity). My control variables are price (logprice), income (loggdp) and the unemployment rate (logunemp).

I am estimating the model as both an FE model and an RE model, but I have difficulty deciding which one is better.
When all variables are included in the model, the Hausman test suggests using the FE model:

. *test FE vs RE - Hausmann Test
. quietly xtreg logcons logprice logunemp loggdp lim compr, fe

. est store fixed

. quietly xtreg logcons logprice logunemp loggdp lim compr, re

. est store random

. hausman fixed random, sigmamore

                 ---- Coefficients ----
             |      (b)          (B)            (b-B)     sqrt(diag(V_b-V_B))
             |     fixed        random       Difference          S.E.
-------------+----------------------------------------------------------------
    logprice |   -.2793116     -.292519        .0132073        .0050106
    logunemp |   -.0708074    -.0643395       -.0064679        .0021851
      loggdp |   -.4403702    -.3967151       -.0436551        .0137747
         lim |    .0201361     .0147369        .0053992        .0020572
       compr |   -.0198701    -.0286352        .0087651        .0034372
------------------------------------------------------------------------------
                           b = consistent under Ho and Ha; obtained from xtreg
            B = inconsistent under Ha, efficient under Ho; obtained from xtreg

    Test:  Ho:  difference in coefficients not systematic

                  chi2(5) = (b-B)'[(V_b-V_B)^(-1)](b-B)
                          =       12.57
                Prob>chi2 =      0.0278

But when I use only the ban variables “lim” and “compr” and the control variable “logprice” (the other variables are not significant in either model), the Hausman test suggests going with the RE model:

. *test FE vs RE - Hausmann Test (without unemployment and gdp)
. quietly xtreg logcons logprice lim compr, fe

. est store fixed

. quietly xtreg logcons logprice lim compr, re

. est store random

. hausman fixed random, sigmamore

                 ---- Coefficients ----
             |      (b)          (B)            (b-B)     sqrt(diag(V_b-V_B))
             |     fixed        random       Difference          S.E.
-------------+----------------------------------------------------------------
    logprice |   -.4344597    -.4327692       -.0016904        .0019615
         lim |   -.0238279    -.0242847        .0004567        .0010905
       compr |   -.0999757    -.0999975        .0000218        .0016702
------------------------------------------------------------------------------
                           b = consistent under Ho and Ha; obtained from xtreg
            B = inconsistent under Ha, efficient under Ho; obtained from xtreg

    Test:  Ho:  difference in coefficients not systematic

                  chi2(3) = (b-B)'[(V_b-V_B)^(-1)](b-B)
                          =        0.93
                Prob>chi2 =      0.8179

.
end of do-file

I would really appreciate any comments or help on this topic. I understand that the Hausman test is valid only under homoskedasticity and cannot include time fixed effects (which I included in both models and which are significant).
Is there another test that might be more appropriate? Could it be reasonable to go with an RE model?
I think the model suffers from omitted variable bias, as I cannot include variables like "attitude towards health", "public image of smoking" or "social acceptance", which might be the main drivers in this model. But all of these unobserved variables change over time, so an FE model is not of much help. Or am I wrong?

Thanks a lot

Best regards
Louisa
Eric de Souza

Join Date: Mar 2014

Posts: 587
#2

20 Aug 2015, 02:57

Type -ssc describe xtoverid- in Stata to get a description of a test that can also be used with the robust and cluster options.
You can then install it with -ssc install xtoverid-
Then -help xtoverid-
Louisa Krekel

Join Date: Jul 2015

Posts: 33
#3

20 Aug 2015, 03:09

Thanks a lot, Eric!

I did use the xtoverid command after the RE model.

First, including all variables in the model

quietly xtreg logcons logprice logunemp loggdp lim compr $t, re cluster (Country)

. xtoverid

Test of overidentifying restrictions: fixed vs random effects
Cross-section time-series model: xtreg re robust cluster(Country)
Sargan-Hansen statistic 2.1e+04 Chi-sq(15) P-value = 0.0000

and second leaving GDP and unemployment rate out.

quietly xtreg logcons logprice lim compr $t, re cluster (Country)

. xtoverid

Test of overidentifying restrictions: fixed vs random effects
Cross-section time-series model: xtreg re robust cluster(Country)
Sargan-Hansen statistic 2.8e+05 Chi-sq(13) P-value = 0.0000

Here, both tests suggest going with the FE model, right?

Last edited by Louisa Krekel; 20 Aug 2015, 03:11.
Eric de Souza

Join Date: Mar 2014

Posts: 587
#4

20 Aug 2015, 03:16

Correct. The extra restrictions imposed by RE are rejected. But the very high value of the Sargan-Hansen statistic worries me. I won't be checking back again today.
Louisa Krekel

Join Date: Jul 2015

Posts: 33
#5

20 Aug 2015, 03:21

Thanks again, Eric!
Could anyone else help here?
Williams Ahouakan

Join Date: Mar 2015

Posts: 32
#6

20 Aug 2015, 05:47

If theory tells you that your model has omitted variables that can cause endogeneity problems, I think the FE model will be more suitable than the RE model whatever the result of your Hausman test! Baum (2006) mentions that Hausman tests can give conflicting results. But in your case, I think the difference in the results is linked to the difference in model specification. I think the FE model is suitable if your aim is to address endogeneity problems. Indeed, the FE model can be considered a variant of the IV model in which you use (X - mean of X) as an instrument for X.
In doing so you are sure to have a strong instrument, as (X - mean of X) is strongly correlated with X, i.e. your independent variables.
I hope this will help you!
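As a rough illustration of the within/IV point above, the sketch below demeans each regressor by country and uses the demeaned versions as instruments for the originals; under these assumptions the slope estimates should coincide with those from -xtreg, fe-. The variable names complete, m_* and dm_* are hypothetical helpers, and the data are assumed to be xtset on Country already. Note that this only removes country-level effects that are constant over time; it does not deal with omitted variables that vary over time.

```stata
* Sketch only: helper names (complete, m_*, dm_*) are illustrative assumptions.
* Restrict to complete cases so every country mean uses the same observations.
generate byte complete = !missing(logcons, logprice, lim, compr)

foreach v of varlist logprice lim compr {
    * Country-specific mean, then deviation from that mean
    bysort Country: egen double m_`v' = mean(`v') if complete
    generate double dm_`v' = `v' - m_`v'
}

* IV regression with (X - mean of X) as instruments for X
ivregress 2sls logcons (logprice lim compr = dm_logprice dm_lim dm_compr) if complete

* Fixed-effects (within) regression for comparison of the slope coefficients
xtreg logcons logprice lim compr if complete, fe
```

The standard errors from the two commands are not expected to match, since -ivregress- does not account for the country means that were estimated in the first step.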
Louisa Krekel

Join Date: Jul 2015

Posts: 33
#7

20 Aug 2015, 08:21

Thank you Williams for your answer.
I also think that the FE model might be more suitable than the RE model, especially when I include time dummies in the model, which control for variables that change over time, like "social acceptance" and "attitude towards health". Would you agree? Thanks again, I really appreciate your help!
Eric de Souza

Join Date: Mar 2014

Posts: 587
#8

20 Aug 2015, 09:09

It would be useful to know how many observations you have in the two dimensions of your panel data: N (number of countries) and T (number of time periods).
Louisa Krekel

Join Date: Jul 2015

Posts: 33
#9

20 Aug 2015, 09:13

I investigate 29 OECD countries over 23 years (1990-2012).

So, N=29 and T=23
Eric de Souza

Join Date: Mar 2014

Posts: 587
#10

20 Aug 2015, 09:26

Then since N and T are about the same size, you should compare the standard errors of the coefficients with and without cluster, and also run the xtoverid test with and without cluster. Leave the time trend in. (I suppose that $t is the time trend)
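One possible way to set up that comparison is sketched below. It assumes that the global $t holds the time terms and that xtoverid has been installed from SSC; the stored names re_plain and re_clust are hypothetical.

```stata
* Sketch only: re_plain and re_clust are illustrative names for stored results.
* RE model with conventional standard errors, followed by the overid test
quietly xtreg logcons logprice logunemp loggdp lim compr $t, re
estimates store re_plain
xtoverid

* Same model with standard errors clustered by country, followed by the test
quietly xtreg logcons logprice logunemp loggdp lim compr $t, re vce(cluster Country)
estimates store re_clust
xtoverid

* Coefficients and standard errors side by side for the main regressors
estimates table re_plain re_clust, keep(logprice logunemp loggdp lim compr) b(%9.4f) se(%9.4f)
```

The point estimates are identical across the two columns by construction, so only the standard errors and the two xtoverid statistics are informative here.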
Williams Ahouakan

Join Date: Mar 2015

Posts: 32
#11

20 Aug 2015, 09:29

I think yes! You can control for time effects whenever you think that unexpected variations or special events might affect the outcome variable. You can also use the command testparm after estimating your FE model to test whether time fixed effects are needed.
Another way to see whether your FE model is suitable is to look at descriptive statistics (command xtsum) and to compare the within R2 with the between R2. If the within R2 is generally higher than the between R2 across your variables, then you will be more comfortable using the FE model.
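A minimal sketch of both suggestions follows, assuming that the global $t expands to the list of year dummies used elsewhere in this thread.

```stata
* Sketch only: assumes $t contains the year dummies (e.g. year1991 ... year2012).
* FE model with year dummies and country-clustered standard errors
xtreg logcons logprice logunemp loggdp lim compr $t, fe vce(cluster Country)

* Joint test that all year-dummy coefficients are zero
testparm $t

* Overall, between and within variation of each variable
xtsum logcons logprice logunemp loggdp lim compr
```

A small p-value from testparm would point towards keeping the time effects in the model. The within and between R2 themselves are reported in the header of the xtreg output, while xtsum shows how much each variable varies within and between countries.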
Louisa Krekel

Join Date: Jul 2015

Posts: 33
#12

20 Aug 2015, 09:39

First FE-regression:

xtreg logcons logprice logunemp loggdp lim compr $t, fe vce (cluster Country)

Fixed-effects (within) regression Number of obs = 586
Group variable: Country Number of groups = 28

R-sq: within = 0.6744 Obs per group: min = 16
between = 0.0285 avg = 20.9
overall = 0.1670 max = 23

F(27,27) = 1467.35
corr(u_i, Xb) = 0.0091 Prob > F = 0.0000

(Std. Err. adjusted for 28 clusters in Country)

Robust
logcons Coef. Std. Err. t P>t [95% Conf. Interval]

logprice -.1521436 .0666861 -2.28 0.031 -.2889723 -.0153149
logunemp -.0096248 .0407143 -0.24 0.815 -.0931637 .0739142
loggdp -.0941326 .1682284 -0.56 0.580 -.4393088 .2510437
lim .0413048 .026538 1.56 0.131 -.0131466 .0957562
compr .0454582 .0516025 0.88 0.386 -.0604215 .1513378
year1991 -.0021066 .0233475 -0.09 0.929 -.0500116 .0457985
year1992 -.0421963 .027693 -1.52 0.139 -.0990177 .0146251
year1993 -.1025348 .0426624 -2.40 0.023 -.1900708 -.0149989
year1994 -.0834783 .0360658 -2.31 0.028 -.1574792 -.0094774
year1995 -.081616 .0432097 -1.89 0.070 -.170275 .007043
year1996 -.0999646 .0480262 -2.08 0.047 -.1985063 -.0014229
year1997 -.0979546 .0500731 -1.96 0.061 -.2006961 .0047868
year1998 -.1138117 .0556946 -2.04 0.051 -.2280875 .0004642
year1999 -.108034 .0567961 -1.90 0.068 -.2245699 .008502
year2000 -.1118515 .0648817 -1.72 0.096 -.2449777 .0212746
year2001 -.1351055 .0681635 -1.98 0.058 -.2749655 .0047546
year2002 -.1212705 .0730016 -1.66 0.108 -.2710574 .0285163
year2003 -.1529623 .0748291 -2.04 0.051 -.3064989 .0005743
year2004 -.174586 .0774919 -2.25 0.033 -.3335863 -.0155857
year2005 -.2178057 .0778478 -2.80 0.009 -.3775362 -.0580753
year2006 -.2382353 .0810007 -2.94 0.007 -.4044349 -.0720357
year2007 -.271323 .0775241 -3.50 0.002 -.4303894 -.1122566
year2008 -.3009147 .0776379 -3.88 0.001 -.4602145 -.1416149
year2009 -.3385844 .0839133 -4.03 0.000 -.5107602 -.1664086
year2010 -.3611086 .0896201 -4.03 0.000 -.5449938 -.1772234
year2011 -.3837863 .0915944 -4.19 0.000 -.5717225 -.1958501
year2012 -.4714311 .0940788 -5.01 0.000 -.6644648 -.2783975
_cons 8.801786 1.694151 5.20 0.000 5.325675 12.2779

sigma_u .37388657
sigma_e .10829839
rho .92259396 (fraction of variance due to u_i)

.
end of do-file

and the RE regression

xtreg logcons logprice logunemp loggdp lim compr $t, re vce (cluster Country) theta

Random-effects GLS regression Number of obs = 586
Group variable: Country Number of groups = 28

R-sq: within = 0.6739 Obs per group: min = 16
between = 0.0798 avg = 20.9
overall = 0.1937 max = 23

Wald chi2(27) = 39678.48
corr(u_i, X) = 0 (assumed) Prob > chi2 = 0.0000

theta --------------------
min 5% median 95% max
0.8992 0.9022 0.9140 0.9158 0.9158

(Std. Err. adjusted for 28 clusters in Country)

Robust
logcons Coef. Std. Err. z P>z [95% Conf. Interval]

logprice -.1626189 .0661288 -2.46 0.014 -.2922289 -.0330089
logunemp -.0026615 .0390791 -0.07 0.946 -.0792551 .0739322
loggdp -.0366685 .1652797 -0.22 0.824 -.3606107 .2872737
lim .036683 .0262169 1.40 0.162 -.0147011 .0880672
compr .0379836 .0510364 0.74 0.457 -.0620459 .1380131
year1991 -.0020122 .0232742 -0.09 0.931 -.0476288 .0436044
year1992 -.0436133 .0268978 -1.62 0.105 -.0963319 .0091054
year1993 -.1040575 .0418759 -2.48 0.013 -.1861327 -.0219823
year1994 -.0851711 .034798 -2.45 0.014 -.1533739 -.0169684
year1995 -.0852297 .0415686 -2.05 0.040 -.1667026 -.0037568
year1996 -.1047196 .0468804 -2.23 0.025 -.1966036 -.0128357
year1997 -.1037734 .0489507 -2.12 0.034 -.199715 -.0078319
year1998 -.120722 .0534597 -2.26 0.024 -.2255011 -.0159428
year1999 -.1155815 .0541096 -2.14 0.033 -.2216344 -.0095285
year2000 -.1210982 .0610069 -1.98 0.047 -.2406695 -.001527
year2001 -.1443116 .0636303 -2.27 0.023 -.2690247 -.0195985
year2002 -.1309832 .0675934 -1.94 0.053 -.2634637 .0014973
year2003 -.1630397 .0689482 -2.36 0.018 -.2981757 -.0279037
year2004 -.1855205 .0718282 -2.58 0.010 -.3263012 -.0447398
year2005 -.22868 .0714494 -3.20 0.001 -.3687181 -.0886418
year2006 -.2505803 .0756864 -3.31 0.001 -.3989228 -.1022377
year2007 -.2839576 .0739453 -3.84 0.000 -.4288878 -.1390274
year2008 -.3135869 .0719444 -4.36 0.000 -.4545954 -.1725784
year2009 -.3513294 .0768148 -4.57 0.000 -.5018836 -.2007752
year2010 -.374602 .082132 -4.56 0.000 -.5355778 -.2136262
year2011 -.3981123 .0913657 -4.36 0.000 -.5771859 -.2190387
year2012 -.4867316 .0898576 -5.42 0.000 -.6628492 -.310614
_cons 8.220524 1.679469 4.89 0.000 4.928826 11.51222

sigma_u .26734342
sigma_e .10829839
rho .85903373 (fraction of variance due to u_i)

.
end of do-file

From what I can see, the SE do not differ much.

And now the Sargan Hansen test with cluster

quietly xtreg logcons logprice logunemp loggdp lim compr $t, re cluster (Country)

. xtoverid

Test of overidentifying restrictions: fixed vs random effects
Cross-section time-series model: xtreg re robust cluster(Country)
Sargan-Hansen statistic 2.1e+04 Chi-sq(15) P-value = 0.0000

.
end of do-file

And now without cluster

. quietly xtreg logcons logprice logunemp loggdp lim compr $t, re

.
. xtoverid

Test of overidentifying restrictions: fixed vs random effects
Cross-section time-series model: xtreg re
Sargan-Hansen statistic 39.880 Chi-sq(15) P-value = 0.0005

Both tests recommend going with the FE model, right?
Thanks a lot Eric! As I am new to Stata I'm really grateful for any help.
Marcos Almeida

Join Date: Apr 2014

Posts: 4047
#13

20 Aug 2015, 09:46

This is just a personal view. What the Hausman test does is generally well stated in its output: it is a matter of choosing between an estimator that is consistent under both hypotheses and one that is efficient under the null. It is not of much help in other situations. In other words, I fear it is not very useful to judge the appropriateness of quite different models on the basis of Hausman test results. These paradoxical results may be related to the degree of endogeneity involving some of the predictors. I believe we should first select the model according to the rationale. We could then do some modeling (adding or excluding variables) followed by postestimation checks, and only at that point test whether an RE or an FE specification applies. That said, there are cases where we may choose FE because of the theoretical background, a particularity of the field or the main aim of the study, rather than put much weight on the result of a single test which, naturally, is subject to criticisms, pitfalls and limitations.

Best regards,

Marcos
Eric de Souza

Join Date: Mar 2014

Posts: 587
#14

20 Aug 2015, 09:56

There is a reason for choosing FE: the country effects are most likely correlated with the explanatory variables in your case. That is what Marcos calls the theoretical background. I would stay with that.
Williams Ahouakan

Join Date: Mar 2015

Posts: 32
#15

20 Aug 2015, 10:07

Yes! I also think that FE is more suitable.

There is yet another possibility. I have tried it in a paper I'm still writing, but it depends on the size of your sample. I think it could give you estimators that are both consistent (i.e. the within estimators) and efficient. This method was developed by Mundlak. The Stata command, which was programmed by Green et al., is also available (ssc install mundlak).
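For readers who prefer to see the idea without the SSC command, here is a hand-rolled sketch of a Mundlak-type regression: the country means of the time-varying regressors are added to the RE model, and a joint test of those means serves as a cluster-robust alternative to the Hausman test. The names insample and mean_* are hypothetical, and $t is assumed to hold the year dummies.

```stata
* Sketch only: insample and mean_* are illustrative helper names.
* Use complete cases so the country means match the estimation sample
generate byte insample = !missing(logcons, logprice, logunemp, loggdp, lim, compr)

foreach v of varlist logprice logunemp loggdp lim compr {
    * Country-specific mean of each time-varying regressor
    bysort Country: egen double mean_`v' = mean(`v') if insample
}

* RE model augmented with the country means (Mundlak device)
xtreg logcons logprice logunemp loggdp lim compr ///
    mean_logprice mean_logunemp mean_loggdp mean_lim mean_compr $t ///
    if insample, re vce(cluster Country)

* Joint test of the means: rejection suggests the plain RE assumptions fail
test mean_logprice mean_logunemp mean_loggdp mean_lim mean_compr
```

In a balanced panel the coefficients on the original regressors in this augmented model reproduce the FE estimates; with an unbalanced panel such as this one the match may only be approximate unless the country means of the year dummies are added as well.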
