  • Hausman Test - different results FE and RE model

    Hi everyone,


    I am investigating the effect of advertising bans on tobacco consumption. My dependent variable is tobacco consumption (logcons), and my explanatory variables are advertising ban dummies (weak, limited and comprehensive; I include only limited, “lim”, and comprehensive, “compr”, to avoid perfect multicollinearity). My control variables are price (logprice), income (loggdp) and the unemployment rate (logunemp).


    I am estimating the model with both an FE and an RE specification, but I have difficulty deciding which one is better.
    When I include all variables in the model, the Hausman test suggests using the FE model:


    . *test FE vs RE - Hausmann Test
    . quietly xtreg logcons logprice logunemp loggdp lim compr, fe

    . est store fixed

    . quietly xtreg logcons logprice logunemp loggdp lim compr, re

    . est store random

    . hausman fixed random, sigmamore

                     ---- Coefficients ----
                 |      (b)          (B)            (b-B)     sqrt(diag(V_b-V_B))
                 |     fixed        random       Difference          S.E.
    -------------+----------------------------------------------------------------
        logprice |   -.2793116     -.292519        .0132073        .0050106
        logunemp |   -.0708074    -.0643395       -.0064679        .0021851
          loggdp |   -.4403702    -.3967151       -.0436551        .0137747
             lim |    .0201361     .0147369        .0053992        .0020572
           compr |   -.0198701    -.0286352        .0087651        .0034372
    ------------------------------------------------------------------------------
                               b = consistent under Ho and Ha; obtained from xtreg
                B = inconsistent under Ha, efficient under Ho; obtained from xtreg

        Test:  Ho:  difference in coefficients not systematic

                      chi2(5) = (b-B)'[(V_b-V_B)^(-1)](b-B)
                              =       12.57
                    Prob>chi2 =      0.0278



    But when I use only the ban variables “lim” and “compr” and the control variable “logprice” (the other variables are not significant in either model), the Hausman test suggests going with the RE model:


    . *test FE vs RE - Hausmann Test (without unemployment and gdp)
    . quietly xtreg logcons logprice lim compr, fe

    . est store fixed

    . quietly xtreg logcons logprice lim compr, re

    . est store random

    . hausman fixed random, sigmamore

                     ---- Coefficients ----
                 |      (b)          (B)            (b-B)     sqrt(diag(V_b-V_B))
                 |     fixed        random       Difference          S.E.
    -------------+----------------------------------------------------------------
        logprice |   -.4344597    -.4327692       -.0016904        .0019615
             lim |   -.0238279    -.0242847        .0004567        .0010905
           compr |   -.0999757    -.0999975        .0000218        .0016702
    ------------------------------------------------------------------------------
                               b = consistent under Ho and Ha; obtained from xtreg
                B = inconsistent under Ha, efficient under Ho; obtained from xtreg

        Test:  Ho:  difference in coefficients not systematic

                      chi2(3) = (b-B)'[(V_b-V_B)^(-1)](b-B)
                              =        0.93
                    Prob>chi2 =      0.8179

    .
    end of do-file


    I would really appreciate any comments or help on this topic. I know that the Hausman test is only valid under homoskedasticity and cannot include time fixed effects (which I included in both models and which are significant).
    Is there another test that might be more appropriate? Could it be reasonable to go with an RE model?
    I think the model suffers from omitted variable bias, as I cannot include variables like "attitude towards health", "public image of smoking" or "social acceptance", which might be the main drivers in this model. But all of these unobserved variables change over time, so an FE model does not help much. Or am I wrong?

    Thanks a lot

    Best regards
    Louisa

  • #2
    Type -ssc describe xtoverid- in Stata to get a description of a test that can also be used with the robust and cluster options.
    You can then install it with -ssc install xtoverid-
    Then -help xtoverid-
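
    A minimal sketch of that workflow, assuming the variable names from post #1 and Country as the cluster variable (as used later in the thread). If xtoverid reports that another SSC package such as ivreg2 is missing, installing that package as well should resolve it; the help file is the authority on this.

    ```stata
    * Describe, install, and read the help for xtoverid (from SSC)
    ssc describe xtoverid
    ssc install xtoverid
    help xtoverid

    * xtoverid is a postestimation command: run it right after the RE fit,
    * here with cluster-robust standard errors
    xtreg logcons logprice logunemp loggdp lim compr, re vce(cluster Country)
    xtoverid
    ```

    A rejection of the null says the extra orthogonality conditions that RE imposes do not hold, which points towards FE.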



    • #3
      Thanks a lot, Eric!

      I did use the xtoverid command after the RE model.

      First, including all variables in the model

      quietly xtreg logcons logprice logunemp loggdp lim compr $t, re cluster (Country)

      . xtoverid

      Test of overidentifying restrictions: fixed vs random effects
      Cross-section time-series model: xtreg re robust cluster(Country)
      Sargan-Hansen statistic 2.1e+04 Chi-sq(15) P-value = 0.0000


      and second leaving GDP and unemployment rate out.


      quietly xtreg logcons logprice lim compr $t, re cluster (Country)

      . xtoverid

      Test of overidentifying restrictions: fixed vs random effects
      Cross-section time-series model: xtreg re robust cluster(Country)
      Sargan-Hansen statistic 2.8e+05 Chi-sq(13) P-value = 0.0000

      Here, both tests suggest going with the FE model, right?



      Last edited by Louisa Krekel; 20 Aug 2015, 03:11.



      • #4
        Correct. The extra restrictions imposed by RE are rejected. But the very high value of the Sargan-Hansen statistic worries me. I won't be checking back again today.



        • #5
          Thanks again, Eric!
          Could anyone else help here?



          • #6
            If theory tells you that your model omits variables that could cause endogeneity problems, I think the FE model is more suitable than the RE model, whatever the result of your Hausman test! Baum (2006) mentions that Hausman tests can give conflicting results. In your case, though, I think the difference in results comes from the difference in model specification. The FE model is suitable if your aim is to address endogeneity. Indeed, the FE model can be seen as a variant of the IV model in which (X - group mean of X) is used as an instrument for X.
            That way you can be sure the instrument is strong, since (X - group mean of X) is strongly correlated with X, i.e. your independent variables.
            I hope this helps!
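
            The "FE as IV" remark can be checked numerically. The sketch below is only an illustration under stated assumptions: it uses the nlswork example dataset from the Stata manuals (not the data in this thread), a single regressor, and hypothetical helper variables tenure_bar and tenure_dm.

            ```stata
            * Example data from the Stata manuals (webuse needs internet access)
            webuse nlswork, clear
            xtset idcode year
            keep if !missing(ln_wage, tenure)

            * Within (FE) estimate of the slope on tenure
            xtreg ln_wage tenure, fe

            * Demean tenure within person and use the result as the instrument
            bysort idcode: egen double tenure_bar = mean(tenure)
            generate double tenure_dm = tenure - tenure_bar
            ivregress 2sls ln_wage (tenure = tenure_dm)
            ```

            The slope on tenure should coincide across the two commands, because the demeaned instrument is orthogonal to every person-level mean. The standard errors will differ, since ivregress does not account for the absorbed person effects.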



            • #7
              Thank you Williams for your answer.
              I also think that the FE model might be more suitable than the RE model, especially when I include time dummies, which control for variables that change over time, like "social acceptance" and "attitude towards health". Would you agree? Thanks again, I really appreciate your help!



              • #8
                It would be useful to know how many observations you have in the two dimensions of your panel data: N (number of countries) and T (number of time periods).



                • #9
                  I investigate 29 OECD countries over 23 years (1990-2012).

                  So, N=29 and T=23



                  • #10
                    Then since N and T are about the same size, you should compare the standard errors of the coefficients with and without cluster, and also run the xtoverid test with and without cluster. Leave the time trend in. (I suppose that $t is the time trend)
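
                    One way to set up that comparison, as a sketch only: it assumes the global macro $t holds the time variables used in post #3 and that Country is the panel and cluster variable. The stored-estimate names fe_plain and fe_clust are arbitrary.

                    ```stata
                    * FE with conventional standard errors
                    quietly xtreg logcons logprice logunemp loggdp lim compr $t, fe
                    estimates store fe_plain

                    * FE with standard errors clustered by country
                    quietly xtreg logcons logprice logunemp loggdp lim compr $t, fe vce(cluster Country)
                    estimates store fe_clust

                    * Coefficients and standard errors side by side for the main regressors
                    estimates table fe_plain fe_clust, b(%9.4f) se(%9.4f) keep(logprice logunemp loggdp lim compr)

                    * xtoverid after RE, first without and then with clustering
                    quietly xtreg logcons logprice logunemp loggdp lim compr $t, re
                    xtoverid
                    quietly xtreg logcons logprice logunemp loggdp lim compr $t, re vce(cluster Country)
                    xtoverid
                    ```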



                    • #11
                      I think so, yes! You can control for time effects whenever you think that unexpected variation or special events might affect the outcome variable. You can also use the command testparm after estimating your FE model to test whether time fixed effects are needed.
                      Another way to see whether the FE model is suitable is to look at descriptive statistics (command xtsum) and to compare the within R2 with the between R2. If the within R2 is generally higher than the between R2 across your variables, you can be more comfortable using the FE model.
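
                      A short sketch of both checks, assuming $t expands to the year dummies that appear in the model. Note that xtsum reports overall, between and within standard deviations for each variable; the within and between R-squared values are printed in the header of the xtreg output.

                      ```stata
                      * FE model with time dummies, then a joint test that the time dummies are zero
                      xtreg logcons logprice logunemp loggdp lim compr $t, fe vce(cluster Country)
                      testparm $t

                      * Decompose each variable's variation into between and within components
                      xtsum logcons logprice logunemp loggdp lim compr
                      ```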



                      • #12
                        First FE-regression:

                        xtreg logcons logprice logunemp loggdp lim compr $t, fe vce (cluster Country)

                        Fixed-effects (within) regression               Number of obs      =       586
                        Group variable: Country                         Number of groups   =        28

                        R-sq:  within  = 0.6744                         Obs per group: min =        16
                               between = 0.0285                                        avg =      20.9
                               overall = 0.1670                                        max =        23

                                                                        F(27,27)           =   1467.35
                        corr(u_i, Xb)  = 0.0091                         Prob > F           =    0.0000

                                                       (Std. Err. adjusted for 28 clusters in Country)
                        ------------------------------------------------------------------------------
                                     |               Robust
                             logcons |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                        -------------+----------------------------------------------------------------
                            logprice |  -.1521436   .0666861    -2.28   0.031    -.2889723   -.0153149
                            logunemp |  -.0096248   .0407143    -0.24   0.815    -.0931637    .0739142
                              loggdp |  -.0941326   .1682284    -0.56   0.580    -.4393088    .2510437
                                 lim |   .0413048    .026538     1.56   0.131    -.0131466    .0957562
                               compr |   .0454582   .0516025     0.88   0.386    -.0604215    .1513378
                            year1991 |  -.0021066   .0233475    -0.09   0.929    -.0500116    .0457985
                            year1992 |  -.0421963    .027693    -1.52   0.139    -.0990177    .0146251
                            year1993 |  -.1025348   .0426624    -2.40   0.023    -.1900708   -.0149989
                            year1994 |  -.0834783   .0360658    -2.31   0.028    -.1574792   -.0094774
                            year1995 |   -.081616   .0432097    -1.89   0.070     -.170275     .007043
                            year1996 |  -.0999646   .0480262    -2.08   0.047    -.1985063   -.0014229
                            year1997 |  -.0979546   .0500731    -1.96   0.061    -.2006961    .0047868
                            year1998 |  -.1138117   .0556946    -2.04   0.051    -.2280875    .0004642
                            year1999 |   -.108034   .0567961    -1.90   0.068    -.2245699     .008502
                            year2000 |  -.1118515   .0648817    -1.72   0.096    -.2449777    .0212746
                            year2001 |  -.1351055   .0681635    -1.98   0.058    -.2749655    .0047546
                            year2002 |  -.1212705   .0730016    -1.66   0.108    -.2710574    .0285163
                            year2003 |  -.1529623   .0748291    -2.04   0.051    -.3064989    .0005743
                            year2004 |   -.174586   .0774919    -2.25   0.033    -.3335863   -.0155857
                            year2005 |  -.2178057   .0778478    -2.80   0.009    -.3775362   -.0580753
                            year2006 |  -.2382353   .0810007    -2.94   0.007    -.4044349   -.0720357
                            year2007 |   -.271323   .0775241    -3.50   0.002    -.4303894   -.1122566
                            year2008 |  -.3009147   .0776379    -3.88   0.001    -.4602145   -.1416149
                            year2009 |  -.3385844   .0839133    -4.03   0.000    -.5107602   -.1664086
                            year2010 |  -.3611086   .0896201    -4.03   0.000    -.5449938   -.1772234
                            year2011 |  -.3837863   .0915944    -4.19   0.000    -.5717225   -.1958501
                            year2012 |  -.4714311   .0940788    -5.01   0.000    -.6644648   -.2783975
                               _cons |   8.801786   1.694151     5.20   0.000     5.325675     12.2779
                        -------------+----------------------------------------------------------------
                             sigma_u |  .37388657
                             sigma_e |  .10829839
                                 rho |  .92259396   (fraction of variance due to u_i)
                        ------------------------------------------------------------------------------


                        .
                        end of do-file


                        and the RE regression

                        xtreg logcons logprice logunemp loggdp lim compr $t, re vce (cluster Country) theta

                        Random-effects GLS regression                   Number of obs      =       586
                        Group variable: Country                         Number of groups   =        28

                        R-sq:  within  = 0.6739                         Obs per group: min =        16
                               between = 0.0798                                        avg =      20.9
                               overall = 0.1937                                        max =        23

                                                                        Wald chi2(27)      =  39678.48
                        corr(u_i, X)   = 0 (assumed)                    Prob > chi2        =    0.0000

                        ------------------- theta --------------------
                          min      5%       median        95%      max
                        0.8992   0.9022     0.9140     0.9158   0.9158

                                                       (Std. Err. adjusted for 28 clusters in Country)
                        ------------------------------------------------------------------------------
                                     |               Robust
                             logcons |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
                        -------------+----------------------------------------------------------------
                            logprice |  -.1626189   .0661288    -2.46   0.014    -.2922289   -.0330089
                            logunemp |  -.0026615   .0390791    -0.07   0.946    -.0792551    .0739322
                              loggdp |  -.0366685   .1652797    -0.22   0.824    -.3606107    .2872737
                                 lim |    .036683   .0262169     1.40   0.162    -.0147011    .0880672
                               compr |   .0379836   .0510364     0.74   0.457    -.0620459    .1380131
                            year1991 |  -.0020122   .0232742    -0.09   0.931    -.0476288    .0436044
                            year1992 |  -.0436133   .0268978    -1.62   0.105    -.0963319    .0091054
                            year1993 |  -.1040575   .0418759    -2.48   0.013    -.1861327   -.0219823
                            year1994 |  -.0851711    .034798    -2.45   0.014    -.1533739   -.0169684
                            year1995 |  -.0852297   .0415686    -2.05   0.040    -.1667026   -.0037568
                            year1996 |  -.1047196   .0468804    -2.23   0.025    -.1966036   -.0128357
                            year1997 |  -.1037734   .0489507    -2.12   0.034     -.199715   -.0078319
                            year1998 |   -.120722   .0534597    -2.26   0.024    -.2255011   -.0159428
                            year1999 |  -.1155815   .0541096    -2.14   0.033    -.2216344   -.0095285
                            year2000 |  -.1210982   .0610069    -1.98   0.047    -.2406695    -.001527
                            year2001 |  -.1443116   .0636303    -2.27   0.023    -.2690247   -.0195985
                            year2002 |  -.1309832   .0675934    -1.94   0.053    -.2634637    .0014973
                            year2003 |  -.1630397   .0689482    -2.36   0.018    -.2981757   -.0279037
                            year2004 |  -.1855205   .0718282    -2.58   0.010    -.3263012   -.0447398
                            year2005 |    -.22868   .0714494    -3.20   0.001    -.3687181   -.0886418
                            year2006 |  -.2505803   .0756864    -3.31   0.001    -.3989228   -.1022377
                            year2007 |  -.2839576   .0739453    -3.84   0.000    -.4288878   -.1390274
                            year2008 |  -.3135869   .0719444    -4.36   0.000    -.4545954   -.1725784
                            year2009 |  -.3513294   .0768148    -4.57   0.000    -.5018836   -.2007752
                            year2010 |   -.374602    .082132    -4.56   0.000    -.5355778   -.2136262
                            year2011 |  -.3981123   .0913657    -4.36   0.000    -.5771859   -.2190387
                            year2012 |  -.4867316   .0898576    -5.42   0.000    -.6628492    -.310614
                               _cons |   8.220524   1.679469     4.89   0.000     4.928826    11.51222
                        -------------+----------------------------------------------------------------
                             sigma_u |  .26734342
                             sigma_e |  .10829839
                                 rho |  .85903373   (fraction of variance due to u_i)
                        ------------------------------------------------------------------------------


                        .
                        end of do-file


                        From what I can see, the SE do not differ much.


                        And now the Sargan Hansen test with cluster

                        quietly xtreg logcons logprice logunemp loggdp lim compr $t, re cluster (Country)

                        . xtoverid

                        Test of overidentifying restrictions: fixed vs random effects
                        Cross-section time-series model: xtreg re robust cluster(Country)
                        Sargan-Hansen statistic 2.1e+04 Chi-sq(15) P-value = 0.0000

                        .
                        end of do-file


                        And now without cluster

                        . quietly xtreg logcons logprice logunemp loggdp lim compr $t, re

                        .
                        . xtoverid

                        Test of overidentifying restrictions: fixed vs random effects
                        Cross-section time-series model: xtreg re
                        Sargan-Hansen statistic 39.880 Chi-sq(15) P-value = 0.0005


                        Both tests recommend going with the FE model, right?
                        Thanks a lot Eric! As I am new to Stata I'm really grateful for any help.



                        • #13
                          This is just a personal view. What the Hausman test does is, generally, well stated in the output: it is a matter of choosing between an estimator that is consistent under both hypotheses and one that is efficient under the null. It shouldn't be of much help in other situations. In other words, I fear it is not very useful for judging the appropriateness of very different models. These paradoxical results may simply reflect the degree of endogeneity of some of the predictors. I believe we should first select the model according to the rationale. We could then do some modeling (adding or excluding variables), followed by postestimation checks, and only at that point test whether an RE or FE alternative applies. That said, we may well choose FE because of the theoretical background, a particularity of the field, or the main aim, rather than put much weight on the result of a single test which, naturally, is subject to criticisms, pitfalls and limitations.
                          Best regards,

                          Marcos



                          • #14
                            There is a reason for choosing FE: in your case the country effects are most likely correlated with the explanatory variables. That is what Marcos calls the theoretical background. I would stay with that.



                            • #15
                              Yes! I also think that FE is more suitable.

                              There is yet another possibility, which I have tried in a paper I am still writing, though it depends on the size of your sample. I think it could give you estimators that are both consistent (i.e. the within estimators) and efficient. This method was developed by Mundlak. A Stata command programmed by Green et al. is also available (ssc install mundlak).
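
                              For reference, the Mundlak idea can also be coded by hand, which avoids relying on the exact syntax of the SSC command. This is a rough sketch under assumptions: the data are already xtset with Country as the panel variable, $t holds the year dummies, and insample and the m_ variables are hypothetical helper names. As a simplification, group means are added only for the five main regressors, not for the year dummies.

                              ```stata
                              * Fix the estimation sample first, since the panel is unbalanced
                              quietly xtreg logcons logprice logunemp loggdp lim compr $t, re
                              generate byte insample = e(sample)

                              * Country-level means of each time-varying regressor over that sample
                              foreach v of varlist logprice logunemp loggdp lim compr {
                                  bysort Country: egen double m_`v' = mean(cond(insample, `v', .))
                              }

                              * RE model augmented with the group means (Mundlak specification)
                              xtreg logcons logprice logunemp loggdp lim compr ///
                                  m_logprice m_logunemp m_loggdp m_lim m_compr $t if insample, ///
                                  re vce(cluster Country)

                              * Joint test of the group means: rejection speaks against plain RE
                              testparm m_logprice m_logunemp m_loggdp m_lim m_compr
                              ```

                              In this specification the coefficients on the original regressors should be close to the within estimates, and the joint test on the means plays the role of a cluster-robust Hausman-type test.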

