  • Hausman Procedure with Year Dummies

    Hello, I have a question. I am running a regression on panel data (30 municipalities over a 7-year period).
    The model includes year dummies (2005 through 2011). My question is: should I run the Hausman test with the year dummies included in the regression or not? That is, do I run Command 1,

    Command 1:
    xtreg y x1 x2 i.year, fe
    estimates store fe
    xtreg y x1 x2 i.year, re
    estimates store re
    hausman fe re

    Or do I perform the Hausman test without the year dummies and then include the year dummies (i.year) in the regression after I get the result from the Hausman test?

    Command 2:
    xtreg y x1 x2, fe
    estimates store fe
    xtreg y x1 x2, re
    estimates store re
    hausman fe re

    I really want to make sure that I am doing the correct steps. Thank you.

  • #2
    Usha:
    if -i.year- was not omitted due to collinearity, I would run -hausman- including -i.year- among predictors.
    For the future, as recommended in FAQ #12, please post what you typed and what Stata gave you back, too (via CODE delimiters, please). Thanks.
    Kind regards,
    Carlo
    (Stata 19.0)



    • #3
      Hi Carlo, thank you very much for your useful reply. I tried running the commands as you suggested, but the two versions give different results. My dependent variable is CGI (a measure of segregation, ranging from 0 to 1) and my independent variables are GINI (a measure of inequality, ranging from 0 to 1) and OWN (the percentage of home ownership, ranging from 0 to 100). The summary statistics are below:
      Code:
          Variable |       Obs        Mean    Std. Dev.       Min        Max
      -------------+--------------------------------------------------------
               cgi |       210    .2156352    .0857581    .010125   .4555475
              gini |       210    .3345222    .0560951     .14884     .52156
               own |       210    73.59243    15.62575      41.81      96.62
      As suggested, here is the output of the commands I ran without i.year:
      Code:
      . xtset id year
             panel variable:  id (strongly balanced)
              time variable:  year, 2005 to 2011
                      delta:  1 unit
      
      . xtreg cgi gini own, fe
      
      Fixed-effects (within) regression               Number of obs      =       210
      Group variable: id                              Number of groups   =        30
      
      R-sq:  within  = 0.1771                         Obs per group: min =         7
             between = 0.0025                                        avg =       7.0
             overall = 0.0002                                        max =         7
      
                                                      F(2,178)           =     19.16
      corr(u_i, Xb)  = -0.7705                        Prob > F           =    0.0000
      
      ------------------------------------------------------------------------------
               cgi |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
              gini |   .0111995   .0883996     0.13   0.899    -.1632466    .1856456
               own |   .0057712    .000935     6.17   0.000     .0039261    .0076162
             _cons |  -.2128243   .0775178    -2.75   0.007    -.3657965   -.0598521
      -------------+----------------------------------------------------------------
           sigma_u |  .11653562
           sigma_e |  .04836152
               rho |  .85308251   (fraction of variance due to u_i)
      ------------------------------------------------------------------------------
      F test that all u_i=0:     F(29, 178) =    16.06             Prob > F = 0.0000
      
      . estimates store fe
      
      . xtreg cgi gini own, re
      
      Random-effects GLS regression                   Number of obs      =       210
      Group variable: id                              Number of groups   =        30
      
      R-sq:  within  = 0.1685                         Obs per group: min =         7
             between = 0.0016                                        avg =       7.0
             overall = 0.0006                                        max =         7
      
                                                      Wald chi2(2)       =     14.90
      corr(u_i, X)   = 0 (assumed)                    Prob > chi2        =    0.0006
      
      ------------------------------------------------------------------------------
               cgi |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
              gini |   .0603295    .091132     0.66   0.508    -.1182859    .2389449
               own |   .0026179   .0006787     3.86   0.000     .0012877    .0039481
             _cons |   .0027941   .0651718     0.04   0.966    -.1249403    .1305285
      -------------+----------------------------------------------------------------
           sigma_u |  .06904026
           sigma_e |  .04836152
               rho |  .67083644   (fraction of variance due to u_i)
      ------------------------------------------------------------------------------
      
      . estimates store re
      
      . hausman fe re
      
                       ---- Coefficients ----
                   |      (b)          (B)            (b-B)     sqrt(diag(V_b-V_B))
                   |       fe           re         Difference          S.E.
      -------------+----------------------------------------------------------------
              gini |    .0111995     .0603295         -.04913               .
               own |    .0057712     .0026179        .0031532        .0006431
      ------------------------------------------------------------------------------
                                 b = consistent under Ho and Ha; obtained from xtreg
                  B = inconsistent under Ha, efficient under Ho; obtained from xtreg
      
          Test:  Ho:  difference in coefficients not systematic
      
                        chi2(2) = (b-B)'[(V_b-V_B)^(-1)](b-B)
                                =       23.82
                      Prob>chi2 =      0.0000
                      (V_b-V_B is not positive definite)
      And here is the result of the command with i.year:
      Code:
      . xtreg cgi gini own i.year, fe
      
      Fixed-effects (within) regression               Number of obs      =       210
      Group variable: id                              Number of groups   =        30
      
      R-sq:  within  = 0.4519                         Obs per group: min =         7
             between = 0.0000                                        avg =       7.0
             overall = 0.0815                                        max =         7
      
                                                      F(8,172)           =     17.72
      corr(u_i, Xb)  = -0.2673                        Prob > F           =    0.0000
      
      ------------------------------------------------------------------------------
               cgi |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
              gini |   .2309074   .0916353     2.52   0.013     .0500328     .411782
               own |   .0024196   .0009724     2.49   0.014     .0005001     .004339
                   |
              year |
             2006  |  -.0159735   .0105981    -1.51   0.134    -.0368926    .0049456
             2007  |  -.0577968   .0111162    -5.20   0.000    -.0797386   -.0358551
             2008  |  -.0877906   .0112131    -7.83   0.000    -.1099236   -.0656577
             2009  |  -.0539887   .0111471    -4.84   0.000    -.0759915    -.031986
             2010  |  -.0397789   .0119596    -3.33   0.001    -.0633855   -.0161723
             2011  |  -.0819392   .0121935    -6.72   0.000    -.1060072   -.0578711
                   |
             _cons |   .0085117   .0776114     0.11   0.913    -.1446818    .1617052
      -------------+----------------------------------------------------------------
           sigma_u |  .07758166
           sigma_e |  .04015395
               rho |   .7887189   (fraction of variance due to u_i)
      ------------------------------------------------------------------------------
      F test that all u_i=0:     F(29, 172) =    21.01             Prob > F = 0.0000
      
      . estimates store fe1
      
      . xtreg cgi gini own i.year, re
      
      Random-effects GLS regression                   Number of obs      =       210
      Group variable: id                              Number of groups   =        30
      
      R-sq:  within  = 0.4466                         Obs per group: min =         7
             between = 0.0122                                        avg =       7.0
             overall = 0.1523                                        max =         7
      
                                                      Wald chi2(8)       =    139.44
      corr(u_i, X)   = 0 (assumed)                    Prob > chi2        =    0.0000
      
      ------------------------------------------------------------------------------
               cgi |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
              gini |   .2778145   .0880072     3.16   0.002     .1053235    .4503055
               own |   .0012249   .0006461     1.90   0.058    -.0000414    .0024911
                   |
              year |
             2006  |  -.0149128   .0106153    -1.40   0.160    -.0357185    .0058929
             2007  |  -.0604526    .010984    -5.50   0.000    -.0819809   -.0389243
             2008  |  -.0932533   .0107797    -8.65   0.000    -.1143812   -.0721254
             2009  |  -.0588756   .0107676    -5.47   0.000    -.0799797   -.0377715
             2010  |  -.0472762   .0111069    -4.26   0.000    -.0690454   -.0255071
             2011  |  -.0896667   .0113984    -7.87   0.000    -.1120073   -.0673262
                   |
             _cons |   .0846211   .0613673     1.38   0.168    -.0356566    .2048987
      -------------+----------------------------------------------------------------
           sigma_u |  .06978784
           sigma_e |  .04015395
               rho |  .75128506   (fraction of variance due to u_i)
      ------------------------------------------------------------------------------
      
      . estimates store re1
      
      . hausman fe1 re1
      
                       ---- Coefficients ----
                   |      (b)          (B)            (b-B)     sqrt(diag(V_b-V_B))
                   |      fe1          re1         Difference          S.E.
      -------------+----------------------------------------------------------------
              gini |    .2309074     .2778145       -.0469071        .0255296
               own |    .0024196     .0012249        .0011947        .0007268
       2006bn.year |   -.0159735    -.0149128       -.0010608               .
         2007.year |   -.0577968    -.0604526        .0026557        .0017092
         2008.year |   -.0877906    -.0932533        .0054627        .0030871
         2009.year |   -.0539887    -.0588756        .0048869        .0028839
         2010.year |   -.0397789    -.0472762        .0074974         .004435
         2011.year |   -.0819392    -.0896667        .0077276        .0043308
      ------------------------------------------------------------------------------
                                 b = consistent under Ho and Ha; obtained from xtreg
                  B = inconsistent under Ha, efficient under Ho; obtained from xtreg
      
          Test:  Ho:  difference in coefficients not systematic
      
                        chi2(8) = (b-B)'[(V_b-V_B)^(-1)](b-B)
                                =        3.45
                      Prob>chi2 =      0.9030
                      (V_b-V_B is not positive definite)
      As you can see, the two results differ: the test favors fixed effects when i.year is not included, but random effects when i.year is included. Is there a reason for such different results? Thank you.



      • #4
        Usha:
        I guess that your -hausman- test gave back unreliable results, due to -(V_b-V_B is not positive definite)-; for more on this topic, see: http://www.stata.com/statalist/archi.../msg00723.html
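        One workaround that is often tried in this situation, offered here as a sketch and not as something tested on these data, is the -sigmamore- option of -hausman-. It bases both covariance matrices on the disturbance variance estimated by the efficient (RE) estimator, which makes a non-positive-definite difference much less likely, though not impossible. The variable names follow post #3:

        ```stata
        * Sketch only: variable names (cgi, gini, own, id, year) are taken from post #3
        xtset id year
        xtreg cgi gini own i.year, fe
        estimates store fe1
        xtreg cgi gini own i.year, re
        estimates store re1
        * sigmamore: use the RE (efficient) estimate of the error variance for both V_b and V_B
        hausman fe1 re1, sigmamore
        ```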
        Kind regards,
        Carlo
        (Stata 19.0)



        • #5
          A couple of additional comments:
          Your time dummies are clearly relevant, as they are jointly statistically significant (and most of them are also individually significant). Hence, you should include them in the Hausman test as well. However, the FE and RE estimates of those time effects cannot be distinguished statistically. As a consequence, the traditional Hausman test uses the wrong number of degrees of freedom: instead of 8, there should be only 2 (the number of variables that vary along both dimensions). You can use the hausman command with the option df(2) to enforce the correct degrees of freedom.
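          As a minimal sketch of that suggestion, assuming the variable names from post #3 (cgi, gini, own, id, year), the two models are re-estimated with the year dummies and the test is then forced to 2 degrees of freedom:

          ```stata
          * Sketch only: re-estimate both models with year dummies, as in post #3
          xtset id year
          xtreg cgi gini own i.year, fe
          estimates store fe1
          xtreg cgi gini own i.year, re
          estimates store re1
          * df(2): gini and own are the only regressors that vary over both id and year
          hausman fe1 re1, df(2)
          ```

          Only the reference distribution for the p-value changes with df(2); the "not positive definite" note is a separate issue and may still appear.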

          Please also see the comment by Jeff Wooldridge and his suggestion to use a Mundlak device (correlated random effects model) in a similar topic:
          Hausman test with year dummies, V_b-V-B is not positive definite

          You can read more about the correlated random effects model / Mundlak in the following topic, although it emerged from a different question:
          Time-invariant variables in Fixed-effects model
          https://www.kripfganz.de/stata/
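          The Mundlak device mentioned above can be sketched roughly as follows. This is only an illustration under assumptions: gini_bar and own_bar are hypothetical names for the municipality-level means, the other names come from post #3, and the panel is balanced, so the year dummies need no means of their own.

          ```stata
          * Sketch of a Mundlak (correlated random effects) regression.
          * gini_bar and own_bar are hypothetical names for the municipality means.
          xtset id year
          bysort id: egen double gini_bar = mean(gini)
          bysort id: egen double own_bar  = mean(own)
          * RE regression augmented with the group means, cluster-robust standard errors
          xtreg cgi gini own gini_bar own_bar i.year, re vce(cluster id)
          * Joint test of the means: rejection points toward FE, non-rejection toward RE
          test gini_bar own_bar
          ```

          Because this is an ordinary Wald test on two coefficients, it has 2 degrees of freedom by construction and does not depend on V_b-V_B being positive definite.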



          • #6
            Thanks Carlo and Sebastian for the answers. I tried to do what has been suggested and got the results as listed below (also, I revised the regression model):

            Code:
            . xi: xtreg cgi gini emp1 lowskill1 logpop logmed own hs25 sarjana25 eighteen over60 i.year, fe
            i.year            _Iyear_2005-2011    (naturally coded; _Iyear_2005 omitted)
            
            Fixed-effects (within) regression               Number of obs      =       203
            Group variable: id                              Number of groups   =        29
            
            R-sq:  within  = 0.5303                         Obs per group: min =         7
                   between = 0.1904                                        avg =       7.0
                   overall = 0.0000                                        max =         7
            
                                                            F(16,158)          =     11.15
            corr(u_i, Xb)  = -0.6414                        Prob > F           =    0.0000
            
            ------------------------------------------------------------------------------
                     cgi |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
            -------------+----------------------------------------------------------------
                    gini |   .2098422   .0943497     2.22   0.028      .023493    .3961915
                    emp1 |  -.0026784    .001198    -2.24   0.027    -.0050445   -.0003123
               lowskill1 |   .0852168   .0622481     1.37   0.173     -.037729    .2081625
                  logpop |  -.0595715   .0554522    -1.07   0.284    -.1690947    .0499518
                  logmed |  -.0617125   .0504632    -1.22   0.223    -.1613819     .037957
                     own |   .1888378   .1029261     1.83   0.068    -.0144507    .3921263
                    hs25 |  -.0774792   .2608463    -0.30   0.767    -.5926746    .4377162
               sarjana25 |   .5969337   .3447986     1.73   0.085    -.0840753    1.277943
                eighteen |   .0292786   .3245645     0.09   0.928    -.6117662    .6703234
                  over60 |    -1.2718   .6150443    -2.07   0.040     -2.48657   -.0570314
             _Iyear_2006 |   .0143375    .017282     0.83   0.408    -.0197961    .0484711
             _Iyear_2007 |  -.0090962   .0200548    -0.45   0.651    -.0487063    .0305139
             _Iyear_2008 |   -.025067   .0320777    -0.78   0.436    -.0884234    .0382894
             _Iyear_2009 |   .0210048   .0310733     0.68   0.500    -.0403678    .0823774
             _Iyear_2010 |   .0274367   .0355826     0.77   0.442    -.0428422    .0977156
             _Iyear_2011 |   .0174045   .0412467     0.42   0.674    -.0640615    .0988705
                   _cons |   1.849855   1.059795     1.75   0.083     -.243338    3.943047
            -------------+----------------------------------------------------------------
                 sigma_u |  .09637064
                 sigma_e |  .03855145
                     rho |  .86204928   (fraction of variance due to u_i)
            ------------------------------------------------------------------------------
            F test that all u_i=0:     F(28, 158) =     5.44             Prob > F = 0.0000
            
            . xi: xtreg cgi gini emp1 lowskill1 logpop logmed own hs25 sarjana25 eighteen over60 i.year, re
            i.year            _Iyear_2005-2011    (naturally coded; _Iyear_2005 omitted)
            
            Random-effects GLS regression                   Number of obs      =       203
            Group variable: id                              Number of groups   =        29
            
            R-sq:  within  = 0.5156                         Obs per group: min =         7
                   between = 0.7049                                        avg =       7.0
                   overall = 0.6321                                        max =         7
            
                                                            Wald chi2(16)      =    228.90
            corr(u_i, X)   = 0 (assumed)                    Prob > chi2        =    0.0000
            
            ------------------------------------------------------------------------------
                     cgi |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
            -------------+----------------------------------------------------------------
                    gini |   .2054719   .0896372     2.29   0.022     .0297863    .3811576
                    emp1 |  -.0035347   .0009889    -3.57   0.000    -.0054728   -.0015965
               lowskill1 |   .0648403   .0563311     1.15   0.250    -.0455666    .1752472
                  logpop |   .0434794   .0109187     3.98   0.000     .0220792    .0648796
                  logmed |   -.045408   .0324919    -1.40   0.162     -.109091     .018275
                     own |   .2066533   .0700113     2.95   0.003     .0694337    .3438728
                    hs25 |   .0726402   .1986513     0.37   0.715    -.3167091    .4619896
               sarjana25 |   .5680562   .3098139     1.83   0.067    -.0391679     1.17528
                eighteen |   .0512157   .2727291     0.19   0.851    -.4833236     .585755
                  over60 |  -1.391343    .491953    -2.83   0.005    -2.355553   -.4271329
             _Iyear_2006 |   .0094909   .0145438     0.65   0.514    -.0190145    .0379963
             _Iyear_2007 |    -.01527    .016725    -0.91   0.361    -.0480505    .0175105
             _Iyear_2008 |  -.0370664   .0218128    -1.70   0.089    -.0798186    .0056859
             _Iyear_2009 |   .0072028   .0219873     0.33   0.743    -.0358914    .0502971
             _Iyear_2010 |   .0098043   .0221512     0.44   0.658    -.0336113      .05322
             _Iyear_2011 |   .0004769   .0287186     0.02   0.987    -.0558105    .0567642
                   _cons |   .1974652    .483106     0.41   0.683    -.7494053    1.144336
            -------------+----------------------------------------------------------------
                 sigma_u |  .03503294
                 sigma_e |  .03855145
                     rho |  .45229311   (fraction of variance due to u_i)
            ------------------------------------------------------------------------------
            
            . xttest0
            
            Breusch and Pagan Lagrangian multiplier test for random effects
            
                    cgi[id,t] = Xb + u[id] + e[id,t]
            
                    Estimated results:
                                     |       Var     sd = sqrt(Var)
                            ---------+-----------------------------
                                 cgi |   .0064673       .0804197
                                   e |   .0014862       .0385514
                                   u |   .0012273       .0350329
            
                    Test:   Var(u) = 0
                                         chibar2(01) =    73.75
                                      Prob > chibar2 =   0.0000
            
            . xtoverid
            
            Test of overidentifying restrictions: fixed vs random effects
            Cross-section time-series model: xtreg re  
            Sargan-Hansen statistic  10.665  Chi-sq(10)   P-value = 0.3842
            
            
            . testparm i.year
            
             ( 1)  2006.year = 0
             ( 2)  2007.year = 0
             ( 3)  2008.year = 0
             ( 4)  2009.year = 0
             ( 5)  2010.year = 0
             ( 6)  2011.year = 0
            
                       chi2(  6) =   29.03
                     Prob > chi2 =    0.0001
            .
            Does this mean that I have to use random effects? Also, would it be a problem to use random effects when the article I am following states that two-way fixed effects (OLS + Year Dummy + Region Dummy, "to control for any unobserved attributes of metropolitan areas that do not change over time and could be correlated with both inequality and segregation levels") were used for the regression?

            Thank you very much for the help!



            • #7
              I forgot to add: another problem I have is that CGI (the measure of segregation) is an index that ranges from zero to one. However, with fixed effects, the estimated constant is 1.849855. What could be the source of this problem? Should the dependent variable be treated beforehand when it has a restricted range of values (in this case, 0 to 1)? Thanks!



              • #8
                Usha:
                taking a look at the R-sqs, I would go -re-.
                As an aside, I would forget -xi- and create interactions (and categorical variables) via -fvvarlist-.
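                For instance, a rough sketch using the regressors from post #6 (the interaction in the last command is purely illustrative and not part of the model discussed in this thread):

                ```stata
                * Sketch only: the model from post #6 without the xi: prefix
                xtset id year
                xtreg cgi gini emp1 lowskill1 logpop logmed own hs25 sarjana25 eighteen over60 i.year, fe
                * joint test of the year indicators, using the same factor-variable syntax
                testparm i.year
                * c. marks continuous variables; ## adds main effects plus their interaction
                * (hypothetical interaction, shown only to illustrate the notation)
                xtreg cgi c.gini##c.own i.year, fe
                ```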
                Kind regards,
                Carlo
                (Stata 19.0)

