  • Fixed and random effects regression

    Hi, I am currently running pooled OLS, fixed effects and random effects regressions. After fitting all three and running the Hausman test and the Breusch-Pagan test, I concluded that the fixed effects model is the one to go with. However, most of the coefficients still have low t-values, even after adding robust to the fixed effects model to help with the standard errors. My dependent variable is output per hour worked (a measure of productivity) and my independent variables are age, female, higher education, life satisfaction, general health, children in the household, ethnic majority, urban, gross monthly income, in a couple, dummy variables for occupations 1-7, region (Yorkshire and the UK as a whole) and 16 dummy variables for broad industries. I declared the data as panel data using
    xtset industry
    After that I checked everything and decided that the fixed effects model with robust standard errors is the best model. These are my results:

    . xtreg outputperhourworked age female lifesatisfaction highereducation grossmonthlyincome generalhealth urban region occ1
    > -occ7, robust fe

    Fixed-effects (within) regression Number of obs = 160
    Group variable: industry Number of groups = 16

    R-squared: Obs per group:
    Within = 0.5771 min = 10
    Between = 0.0053 avg = 10.0
    Overall = 0.0526 max = 10

    F(15,15) = 96060.12
    corr(u_i, Xb) = -0.0838 Prob > F = 0.0000

    (Std. err. adjusted for 16 clusters in industry)
    ------------------------------------------------------------------------------------
    | Robust
    outputperhourwor~d | Coefficient std. err. t P>|t| [95% conf. interval]
    -------------------+----------------------------------------------------------------
    age | .2078659 .1047539 1.98 0.066 -.0154118 .4311437
    female | 2.958096 4.007467 0.74 0.472 -5.583618 11.49981
    lifesatisfaction | 2.233016 4.195188 0.53 0.602 -6.708815 11.17485
    highereducation | -9.022512 4.293936 -2.10 0.053 -18.17482 .1297953
    grossmonthlyincome | .0010065 .0008429 1.19 0.251 -.0007902 .0028031
    generalhealth | 1.581911 4.61318 0.34 0.736 -8.25085 11.41467
    urban | -4.006694 2.964725 -1.35 0.197 -10.32586 2.312467
    region | -3.74473 .7359868 -5.09 0.000 -5.313449 -2.176011
    occ1 | -11.81984 20.39074 -0.58 0.571 -55.28168 31.64201
    occ2 | 25.69072 13.93247 1.84 0.085 -4.005643 55.38708
    occ3 | 10.87094 7.856785 1.38 0.187 -5.875397 27.61728
    occ4 | 5.970971 8.645878 0.69 0.500 -12.45728 24.39922
    occ5 | .9089322 7.28959 0.12 0.902 -14.62846 16.44633
    occ6 | -6.257695 4.096937 -1.53 0.147 -14.99011 2.474719
    occ7 | 3.073744 4.907315 0.63 0.540 -7.385949 13.53344
    _cons | 14.56094 9.799501 1.49 0.158 -6.326203 35.44808
    -------------------+----------------------------------------------------------------
    sigma_u | 10.992994
    sigma_e | 2.549267
    rho | .94896714 (fraction of variance due to u_i)
    ------------------------------------------------------------------------------------

    The R-squared value seems appropriate, but most of the t-values are still low. Is there anything I can do about this, or should I carry on with these results, or use the pooled OLS or random effects regression instead?
    Any help would be appreciated.
    Thanks

  • #2
    Oliver:
    welcome to this forum.
    The -hausman- test does not support non-default standard errors; therefore, it is not clear how you performed it.
    In addition, 16 panels are not enough to be confident that non-default standard errors are not misleading.
    Finally, almost no coefficient in your regression reaches statistical significance (-region- being the exception): this sounds strange and calls for a double-check of the correctness of your regression model specification.
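One way around the -hausman- limitation, sketched here as an assumption and not as anything that was run in this thread, is a Mundlak-type auxiliary regression: add the panel means of the time-varying regressors to the -re- model, fit it with cluster-robust standard errors, and test the means jointly. The sketch uses the -nlswork- example dataset that ships with Stata; the names mean_age and mean_tenure are made up for illustration.

```stata
* Toy data (assumption: nlswork stands in for the poster's data)
webuse nlswork, clear
xtset idcode year

* Keep one estimation sample so the panel means match the regression sample
keep if !missing(ln_wage, age, tenure)

* Panel means of the time-varying regressors (hypothetical variable names)
bysort idcode: egen mean_age    = mean(age)
bysort idcode: egen mean_tenure = mean(tenure)

* Random-effects model augmented with the panel means, cluster-robust SEs
xtreg ln_wage age tenure mean_age mean_tenure, re vce(cluster idcode)

* Joint test of the panel means: H0 is that they are all zero
test mean_age mean_tenure
```

A small p-value from -test- suggests that the panel means matter, which points towards -fe-. With as few clusters as in this thread, a cluster-robust test of this kind is itself fragile and is best read with caution.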
    Last edited by Carlo Lazzaro; 26 Aug 2022, 11:54.
    Kind regards,
    Carlo
    (Stata 19.0)


    • #3
      Hi Carlo, thanks for the reply. Originally I ran the pooled OLS regression without the industry-level dummy variables, like this:
      . reg outputperhourworked age female incouple lifesatisfaction highereducation children grossmonthlyincome generalhealth e
      > thnicmajority urban occ1-occ7 region

      Source | SS df MS Number of obs = 160
      -------------+---------------------------------- F(18, 141) = 12.40
      Model | 12187.817 18 677.100944 Prob > F = 0.0000
      Residual | 7696.61261 141 54.5859051 R-squared = 0.6129
      -------------+---------------------------------- Adj R-squared = 0.5635
      Total | 19884.4296 159 125.059306 Root MSE = 7.3882

      ------------------------------------------------------------------------------------
      outputperhourwor~d | Coefficient Std. err. t P>|t| [95% conf. interval]
      -------------------+----------------------------------------------------------------
      age | -2.091687 .3650759 -5.73 0.000 -2.813417 -1.369957
      female | 9.237893 4.437883 2.08 0.039 .4645016 18.01128
      incouple | 6.362442 9.516697 0.67 0.505 -12.45142 25.1763
      lifesatisfaction | 19.59183 8.017308 2.44 0.016 3.742165 35.4415
      highereducation | 3.865053 7.930896 0.49 0.627 -11.81379 19.54389
      children | -37.9613 10.12877 -3.75 0.000 -57.9852 -17.93741
      grossmonthlyincome | .0048509 .0014836 3.27 0.001 .0019178 .007784
      generalhealth | -11.11036 7.799978 -1.42 0.157 -26.53038 4.309662
      ethnicmajority | 32.59098 13.35653 2.44 0.016 6.186042 58.99592
      urban | -39.49991 7.304811 -5.41 0.000 -53.94103 -25.0588
      occ1 | 122.1594 20.36826 6.00 0.000 81.89273 162.426
      occ2 | 15.67076 15.412 1.02 0.311 -14.7977 46.13922
      occ3 | 30.5377 10.90227 2.80 0.006 8.98467 52.09074
      occ4 | 16.64603 10.22802 1.63 0.106 -3.574065 36.86613
      occ5 | 23.50738 9.260587 2.54 0.012 5.199835 41.81493
      occ6 | 47.06718 13.58978 3.46 0.001 20.20112 73.93324
      occ7 | -9.415526 10.3078 -0.91 0.363 -29.79334 10.96229
      region | -3.198695 1.362628 -2.35 0.020 -5.892516 -.5048734
      _cons | 75.65066 19.33279 3.91 0.000 37.43106 113.8703
      ------------------------------------------------------------------------------------

      Then I went on to fit the fixed effects model:

      . xtset industry

      Panel variable: industry (balanced)

      . xtreg outputperhourworked age female incouple lifesatisfaction highereducation children grossmonthlyincome generalhealth
      > ethnicmajority urban occ1-occ7 region, fe

      Fixed-effects (within) regression Number of obs = 160
      Group variable: industry Number of groups = 16

      R-squared: Obs per group:
      Within = 0.5879 min = 10
      Between = 0.0036 avg = 10.0
      Overall = 0.0261 max = 10

      F(18,126) = 9.99
      corr(u_i, Xb) = -0.1426 Prob > F = 0.0000

      ------------------------------------------------------------------------------------
      outputperhourwor~d | Coefficient Std. err. t P>|t| [95% conf. interval]
      -------------------+----------------------------------------------------------------
      age | .2117937 .1642327 1.29 0.200 -.1132179 .5368053
      female | 4.089059 4.021471 1.02 0.311 -3.869314 12.04743
      incouple | -4.89975 3.81921 -1.28 0.202 -12.45785 2.658355
      lifesatisfaction | 2.587099 3.188801 0.81 0.419 -3.723444 8.897641
      highereducation | -11.23304 5.169229 -2.17 0.032 -21.46279 -1.003285
      children | -4.171664 4.826863 -0.86 0.389 -13.72388 5.380555
      grossmonthlyincome | .001408 .0007007 2.01 0.047 .0000212 .0027947
      generalhealth | 1.908362 4.194004 0.46 0.650 -6.391449 10.20817
      ethnicmajority | 4.843447 5.912025 0.82 0.414 -6.856277 16.54317
      urban | -1.49449 4.186699 -0.36 0.722 -9.779844 6.790864
      occ1 | -10.15136 12.80911 -0.79 0.430 -35.50021 15.19749
      occ2 | 21.74343 11.48948 1.89 0.061 -.9939112 44.48077
      occ3 | 12.9459 7.313564 1.77 0.079 -1.527428 27.41923
      occ4 | 7.096606 8.712475 0.81 0.417 -10.14513 24.33834
      occ5 | 6.077322 7.416389 0.82 0.414 -8.599494 20.75414
      occ6 | -4.054957 6.805379 -0.60 0.552 -17.5226 9.412688
      occ7 | 7.553854 7.284875 1.04 0.302 -6.862699 21.97041
      region | -3.826225 .5744232 -6.66 0.000 -4.962992 -2.689458
      _cons | 9.833199 10.50739 0.94 0.351 -10.96062 30.62702
      -------------------+----------------------------------------------------------------
      sigma_u | 11.235124
      sigma_e | 2.5462105
      rho | .95114815 (fraction of variance due to u_i)
      ------------------------------------------------------------------------------------
      F test that all u_i=0: F(15, 126) = 70.74 Prob > F = 0.0000

      Then I fitted the random effects model:

      Random-effects GLS regression Number of obs = 160
      Group variable: industry Number of groups = 16

      R-squared: Obs per group:
      Within = 0.1357 min = 10
      Between = 0.8229 avg = 10.0
      Overall = 0.6129 max = 10

      Wald chi2(18) = 223.28
      corr(u_i, X) = 0 (assumed) Prob > chi2 = 0.0000

      ------------------------------------------------------------------------------------
      outputperhourwor~d | Coefficient Std. err. z P>|z| [95% conf. interval]
      -------------------+----------------------------------------------------------------
      age | -2.091687 .3650759 -5.73 0.000 -2.807223 -1.376152
      female | 9.237893 4.437883 2.08 0.037 .5398014 17.93598
      incouple | 6.362442 9.516697 0.67 0.504 -12.28994 25.01482
      lifesatisfaction | 19.59183 8.017308 2.44 0.015 3.878199 35.30547
      highereducation | 3.865053 7.930896 0.49 0.626 -11.67922 19.40932
      children | -37.9613 10.12877 -3.75 0.000 -57.81334 -18.10927
      grossmonthlyincome | .0048509 .0014836 3.27 0.001 .001943 .0077588
      generalhealth | -11.11036 7.799978 -1.42 0.154 -26.39804 4.177316
      ethnicmajority | 32.59098 13.35653 2.44 0.015 6.412669 58.7693
      urban | -39.49991 7.304811 -5.41 0.000 -53.81708 -25.18275
      occ1 | 122.1594 20.36826 6.00 0.000 82.23833 162.0804
      occ2 | 15.67076 15.412 1.02 0.309 -14.5362 45.87771
      occ3 | 30.5377 10.90227 2.80 0.005 9.169654 51.90575
      occ4 | 16.64603 10.22802 1.63 0.104 -3.400521 36.69259
      occ5 | 23.50738 9.260587 2.54 0.011 5.356964 41.6578
      occ6 | 47.06718 13.58978 3.46 0.001 20.4317 73.70266
      occ7 | -9.415526 10.3078 -0.91 0.361 -29.61844 10.78739
      region | -3.198695 1.362628 -2.35 0.019 -5.869396 -.5279938
      _cons | 75.65066 19.33279 3.91 0.000 37.75909 113.5422
      -------------------+----------------------------------------------------------------
      sigma_u | 0
      sigma_e | 2.5462105
      rho | 0 (fraction of variance due to u_i)
      ------------------------------------------------------------------------------------

      Then I ran the Hausman test, which gave:


      . hausman fe re

      Note: the rank of the differenced variance matrix (17) does not equal the number of coefficients being tested (18); be
      sure this is what you expect, or there may be problems computing the test. Examine the output of your estimators
      for anything unexpected and possibly consider scaling your variables so that the coefficients are on a similar
      scale.

      ---- Coefficients ----
      | (b) (B) (b-B) sqrt(diag(V_b-V_B))
      | fe re Difference Std. err.
      -------------+----------------------------------------------------------------
      age | .2117937 -2.091687 2.303481 .
      female | 4.089059 9.237893 -5.148834 .
      incouple | -4.89975 6.362442 -11.26219 .
      lifesatisf~n | 2.587099 19.59183 -17.00474 .
      highereduc~n | -11.23304 3.865053 -15.09809 .
      children | -4.171664 -37.9613 33.78964 .
      grossmonth~e | .001408 .0048509 -.0034429 .
      generalhea~h | 1.908362 -11.11036 13.01872 .
      ethnicmajo~y | 4.843447 32.59098 -27.74754 .
      urban | -1.49449 -39.49991 38.00542 .
      occ1 | -10.15136 122.1594 -132.3107 .
      occ2 | 21.74343 15.67076 6.072671 .
      occ3 | 12.9459 30.5377 -17.5918 .
      occ4 | 7.096606 16.64603 -9.549428 .
      occ5 | 6.077322 23.50738 -17.43006 .
      occ6 | -4.054957 47.06718 -51.12214 .
      occ7 | 7.553854 -9.415526 16.96938 .
      region | -3.826225 -3.198695 -.62753 .
      ------------------------------------------------------------------------------
      b = Consistent under H0 and Ha; obtained from xtreg.
      B = Inconsistent under Ha, efficient under H0; obtained from xtreg.

      Test of H0: Difference in coefficients not systematic

      chi2(17) = (b-B)'[(V_b-V_B)^(-1)](b-B)
      = -700.85

      Warning: chi2 < 0 ==> model fitted on these data
      fails to meet the asymptotic assumptions
      of the Hausman test; see suest for a
      generalized test.


      So I added the sigmamore option, which gave:

      . hausman fe re, sigmamore

      Note: the rank of the differenced variance matrix (15) does not equal the number of coefficients being tested (18); be
      sure this is what you expect, or there may be problems computing the test. Examine the output of your estimators
      for anything unexpected and possibly consider scaling your variables so that the coefficients are on a similar
      scale.

      ---- Coefficients ----
      | (b) (B) (b-B) sqrt(diag(V_b-V_B))
      | fe re Difference Std. err.
      -------------+----------------------------------------------------------------
      age | .2117937 -2.091687 2.303481 .3062946
      female | 4.089059 9.237893 -5.148834 10.79208
      incouple | -4.89975 6.362442 -11.26219 5.678378
      lifesatisf~n | 2.587099 19.59183 -17.00474 4.619213
      highereduc~n | -11.23304 3.865053 -15.09809 12.73109
      children | -4.171664 -37.9613 33.78964 9.673316
      grossmonth~e | .001408 .0048509 -.0034429 .0013904
      generalhea~h | 1.908362 -11.11036 13.01872 9.341224
      ethnicmajo~y | 4.843447 32.59098 -27.74754 10.76502
      urban | -1.49449 -39.49991 38.00542 9.706822
      occ1 | -10.15136 122.1594 -132.3107 31.08965
      occ2 | 21.74343 15.67076 6.072671 29.56226
      occ3 | 12.9459 30.5377 -17.5918 18.20688
      occ4 | 7.096606 16.64603 -9.549428 23.11918
      occ5 | 6.077322 23.50738 -17.43006 19.42534
      occ6 | -4.054957 47.06718 -51.12214 14.32678
      occ7 | 7.553854 -9.415526 16.96938 18.45462
      region | -3.826225 -3.198695 -.62753 .9598942
      ------------------------------------------------------------------------------
      b = Consistent under H0 and Ha; obtained from xtreg.
      B = Inconsistent under Ha, efficient under H0; obtained from xtreg.

      Test of H0: Difference in coefficients not systematic

      chi2(15) = (b-B)'[(V_b-V_B)^(-1)](b-B)
      = 126.03
      Prob > chi2 = 0.0000
      (V_b-V_B is not positive definite)

      This tells me that I should use the fixed effects model. However, as a precaution I also ran the Breusch-Pagan Lagrange multiplier test, which gave:

      . xttest0

      Breusch and Pagan Lagrangian multiplier test for random effects

      outputperhourworked[industry,t] = Xb + u[industry] + e[industry,t]

      Estimated results:
      | Var SD = sqrt(Var)
      ---------+-----------------------------
      outputp~d | 125.0593 11.18299
      e | 6.483188 2.546211
      u | 0 0

      Test: Var(u) = 0
      chibar2(01) = 0.00
      Prob > chibar2 = 1.0000

      After that I checked the fixed effects model for heteroskedasticity:

      . xttest3

      Modified Wald test for groupwise heteroskedasticity
      in fixed effect regression model

      H0: sigma(i)^2 = sigma^2 for all i

      chi2 (16) = 1186.33
      Prob>chi2 = 0.0000

      Based on this, I decided to add the robust option to the fixed effects model, which gives:

      . xtreg outputperhourworked age female incouple lifesatisfaction highereducation children grossmonthlyincome generalhealth
      > ethnicmajority urban occ1-occ7 region, robust fe

      Fixed-effects (within) regression Number of obs = 160
      Group variable: industry Number of groups = 16

      R-squared: Obs per group:
      Within = 0.5879 min = 10
      Between = 0.0036 avg = 10.0
      Overall = 0.0261 max = 10

      F(15,15) = .
      corr(u_i, Xb) = -0.1426 Prob > F = .

      (Std. err. adjusted for 16 clusters in industry)
      ------------------------------------------------------------------------------------
      | Robust
      outputperhourwor~d | Coefficient std. err. t P>|t| [95% conf. interval]
      -------------------+----------------------------------------------------------------
      age | .2117937 .1972323 1.07 0.300 -.208597 .6321844
      female | 4.089059 4.012366 1.02 0.324 -4.463097 12.64121
      incouple | -4.89975 4.403377 -1.11 0.283 -14.28533 4.485826
      lifesatisfaction | 2.587099 4.156245 0.62 0.543 -6.271728 11.44593
      highereducation | -11.23304 4.482184 -2.51 0.024 -20.78659 -1.679489
      children | -4.171664 5.229833 -0.80 0.438 -15.31879 6.97546
      grossmonthlyincome | .001408 .0008818 1.60 0.131 -.0004715 .0032875
      generalhealth | 1.908362 5.440297 0.35 0.731 -9.687357 13.50408
      ethnicmajority | 4.843447 4.641975 1.04 0.313 -5.050689 14.73758
      urban | -1.49449 3.20119 -0.47 0.647 -8.317664 5.328684
      occ1 | -10.15136 22.45018 -0.45 0.658 -58.00279 37.70007
      occ2 | 21.74343 13.15912 1.65 0.119 -6.304562 49.79142
      occ3 | 12.9459 7.491032 1.73 0.104 -3.020857 28.91266
      occ4 | 7.096606 9.876709 0.72 0.483 -13.9551 28.14831
      occ5 | 6.077322 6.362956 0.96 0.355 -7.484997 19.63964
      occ6 | -4.054957 3.760967 -1.08 0.298 -12.07127 3.961355
      occ7 | 7.553854 4.79697 1.57 0.136 -2.670645 17.77835
      region | -3.826225 .7289406 -5.25 0.000 -5.379925 -2.272525
      _cons | 9.833199 13.66569 0.72 0.483 -19.29454 38.96094
      -------------------+----------------------------------------------------------------
      sigma_u | 11.235124
      sigma_e | 2.5462105
      rho | .95114815 (fraction of variance due to u_i)
      ------------------------------------------------------------------------------------

      The problem with this result is that, although the R-squared value is appropriate, most of the coefficients are still insignificant. I am wondering whether to carry on with this fixed effects regression or to use the pooled OLS or the random effects model instead.
      There are 16 panels, but the other variables are aggregated: the data consist of many individual observations aggregated to industry level, for 16 industries over 5 years.
      Any help would be appreciated.
      thank you


      • #4
        Oliver:
        please use CODE delimiters to share what you typed and what Stata gave you back. Thanks.
        That said:
        1) a pooled OLS without -i.industry- as a predictor (and without standard errors clustered on -industry-) makes no sense at all;
        2) assuming that you have a -repeated time values within panel- issue with -xtset-, it is not clear why you did not include -i.year- among the predictors of your -xtreg,fe- equation;
        3) you have heteroskedasticity. Therefore, even though the number of clusters is low, it makes sense to go -xtreg,fe- with cluster-robust standard errors;
        4) -xtreg,re- is out of the question as per the -xttest0- outcome;
        5) what you should do is test whether the functional form of the regressand in your -xtreg,fe- model is correctly specified, as per the following toy-example:
        Code:
        . xtreg ln_wage fitted sq_fitted , fe vce(cluster idcode)
        
        Fixed-effects (within) regression               Number of obs     =     28,510
        Group variable: idcode                          Number of groups  =      4,710
        
        R-squared:                                      Obs per group:
             Within  = 0.1092                                         min =          1
             Between = 0.1033                                         avg =        6.1
             Overall = 0.0881                                         max =         15
        
                                                        F(2,4709)         =     523.09
        corr(u_i, Xb) = 0.0467                          Prob > F          =     0.0000
        
                                     (Std. err. adjusted for 4,710 clusters in idcode)
        ------------------------------------------------------------------------------
                     |               Robust
             ln_wage | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
        -------------+----------------------------------------------------------------
              fitted |   2.569185   .7085064     3.63   0.000     1.180181    3.958189
           sq_fitted |    -.47432   .2153021    -2.20   0.028    -.8964128   -.0522272
               _cons |  -1.290258    .580562    -2.22   0.026    -2.428431   -.1520844
        -------------+----------------------------------------------------------------
             sigma_u |    .403403
             sigma_e |  .30238578
                 rho |  .64025357   (fraction of variance due to u_i)
        ------------------------------------------------------------------------------
        
        .
        As -sq_fitted- reaches statistical significance, the model is misspecified.
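The toy-example above shows only the final regression. A minimal sketch of the steps that would produce -fitted- and -sq_fitted- follows. It assumes the -nlswork- example dataset and a first-stage model of -ln_wage- on -age-; that first-stage specification is a guess made for illustration and may not be the one behind the output shown.

```stata
* Toy data (assumption: the nlswork example dataset)
webuse nlswork, clear
xtset idcode year

* Step 1: fit the candidate model (hypothetical specification)
xtreg ln_wage age, fe vce(cluster idcode)

* Step 2: store the linear prediction and its square
predict fitted, xb
generate sq_fitted = fitted^2

* Step 3: regress the outcome on the prediction and its square
xtreg ln_wage fitted sq_fitted, fe vce(cluster idcode)

* Step 4: a significant squared term hints at a misspecified functional form
test sq_fitted
```

For the model in this thread, the same steps would be run after the full -xtreg,fe- specification, with -industry- as the cluster variable.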
        Kind regards,
        Carlo
        (Stata 19.0)


        • #5
          Now, this is my pooled OLS regression including industries:
          Code:
          reg outputperhourworked age female incouple lifesatisfaction highereducation children grossmonthlyincome generalhealth ethnicmajority urban ind1-ind15
          As you said I have included the i.wave in the xtreg, fe equation which gives:
          Code:
          xtreg outputperhourworked age female incouple lifesatisfaction highereducation children grossmonthlyincome generalhealth ethnicmajority urban region i.wave, fe
          Then finally used xtreg, fe with robust standard errors which gives:
          Code:
          xtreg outputperhourworked age female incouple lifesatisfaction highereducation children grossmonthlyincome generalhealth ethnicmajority urban region i.wave, robust fe
          I'm sorry, but I'm really stuck as to whether to just carry on with this regression and explain its flaws, try to fix it as much as possible, or drop variables to help improve its consistency.
          Thanks


          • #6
            Now, this is my pooled OLS regression including industries:
            Code:
            reg outputperhourworked age female incouple lifesatisfaction highereducation children grossmonthlyincome generalhealth ethnicmajority urban ind1-ind15
                  Source |       SS           df       MS      Number of obs   =       160
            -------------+----------------------------------   F(25, 134)      =     80.10
                   Model |  18637.3043        25  745.492173   Prob > F        =    0.0000
                Residual |  1247.12528       134  9.30690506   R-squared       =    0.9373
            -------------+----------------------------------   Adj R-squared   =    0.9256
                   Total |  19884.4296       159  125.059306   Root MSE        =    3.0507
            
            ------------------------------------------------------------------------------------
            outputperhourwor~d | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
            -------------------+----------------------------------------------------------------
                           age |   .2748922   .1749598     1.57   0.119    -.0711477    .6209322
                        female |   4.790014    3.91466     1.22   0.223    -2.952502    12.53253
                      incouple |  -5.923835   4.461654    -1.33   0.187    -14.74821    2.900539
              lifesatisfaction |  -.4209222   3.204008    -0.13   0.896    -6.757891    5.916046
               highereducation |  -11.82595   5.525567    -2.14   0.034    -22.75456   -.8973413
                      children |  -10.10135   4.291029    -2.35   0.020    -18.58825   -1.614438
            grossmonthlyincome |   .0036281    .000617     5.88   0.000     .0024077    .0048484
                 generalhealth |    6.18834   3.767032     1.64   0.103    -1.262192    13.63887
                ethnicmajority |    2.55508   6.374079     0.40   0.689    -10.05174     15.1619
                         urban |  -15.33324   3.824228    -4.01   0.000    -22.89689   -7.769579
                          ind1 |   27.62078   3.018881     9.15   0.000     21.64996     33.5916
                          ind2 |   15.72706    2.87707     5.47   0.000     10.03672    21.41741
                          ind3 |   5.812294   3.181078     1.83   0.070    -.4793249    12.10391
                          ind4 |   36.17205   3.068046    11.79   0.000     30.10399    42.24011
                          ind5 |   10.15522   2.437754     4.17   0.000     5.333771    14.97668
                          ind6 |   4.281885   2.217484     1.93   0.056    -.1039126    8.667683
                          ind7 |   4.581402   2.807222     1.63   0.105    -.9707948     10.1336
                          ind8 |  -.7753631   2.342294    -0.33   0.741    -5.408012    3.857286
                          ind9 |   28.39202   3.318566     8.56   0.000     21.82848    34.95557
                         ind10 |   9.816667   3.159917     3.11   0.002     3.566901    16.06643
                         ind11 |    1.28196   2.414443     0.53   0.596    -3.493386    6.057307
                         ind12 |   10.04032   2.876909     3.49   0.001     4.350295    15.73034
                         ind13 |   11.14044   3.174646     3.51   0.001     4.861549    17.41934
                         ind14 |   2.915025   2.466157     1.18   0.239    -1.962603    7.792654
                         ind15 |   2.553465   3.163255     0.81   0.421    -3.702902    8.809833
                         _cons |   7.472767   11.23417     0.67   0.507    -14.74646    29.69199
            ------------------------------------------------------------------------------------
            As you said, I have included i.year in the xtreg, fe equation, which gives:
            Code:
            xtreg outputperhourworked age female incouple lifesatisfaction highereducation children grossmonthlyincome generalhealth ethnicmajority urban region i.year, fe
            Fixed-effects (within) regression               Number of obs     =        160
            Group variable: industry                        Number of groups  =         16
            
            R-squared:                                      Obs per group:
                 Within  = 0.5455                                         min =         10
                 Between = 0.0263                                         avg =       10.0
                 Overall = 0.0142                                         max =         10
            
                                                            F(15,129)         =      10.32
            corr(u_i, Xb) = -0.1565                         Prob > F          =     0.0000
            
            ------------------------------------------------------------------------------------
            outputperhourwor~d | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
            -------------------+----------------------------------------------------------------
                           age |   .0949945   .1676812     0.57   0.572    -.2367669    .4267559
                        female |    3.13867   3.446668     0.91   0.364    -3.680647    9.957988
                      incouple |  -5.931795   4.000349    -1.48   0.141    -13.84658    1.982993
              lifesatisfaction |   2.916755   3.299943     0.88   0.378    -3.612262    9.445773
               highereducation |  -10.71285    4.94944    -2.16   0.032    -20.50544   -.9202642
                      children |  -5.001213    3.96734    -1.26   0.210    -12.85069    2.848267
            grossmonthlyincome |   .0016221   .0006904     2.35   0.020     .0002562     .002988
                 generalhealth |   3.680417   3.441799     1.07   0.287    -3.129267     10.4901
                ethnicmajority |   11.20793   5.736378     1.95   0.053    -.1416344    22.55749
                         urban |  -4.849715   3.683193    -1.32   0.190      -12.137    2.437572
                        region |  -3.934039   .6097958    -6.45   0.000    -5.140535   -2.727543
                               |
                          year |
                         2010  |   .1648045   .6929738     0.24   0.812    -1.206261     1.53587
                         2011  |  -.2509595   .7855941    -0.32   0.750    -1.805277    1.303358
                         2012  |  -.0724003   .7496864    -0.10   0.923    -1.555673    1.410873
                         2013  |   .0113145   .8777164     0.01   0.990    -1.725269    1.747898
                               |
                         _cons |   17.80361    9.07735     1.96   0.052    -.1561492    35.76337
            -------------------+----------------------------------------------------------------
                       sigma_u |   11.30491
                       sigma_e |  2.6429505
                           rho |  .94817579   (fraction of variance due to u_i)
            ------------------------------------------------------------------------------------
            F test that all u_i=0: F(15, 129) = 102.29                   Prob > F = 0.0000
            Then finally I used xtreg, fe with robust standard errors, which gives:
            Code:
            xtreg outputperhourworked age female incouple lifesatisfaction highereducation children grossmonthlyincome generalhealth ethnicmajority urban region i.year, robust fe
            
            Fixed-effects (within) regression               Number of obs     =        160
            Group variable: industry                        Number of groups  =         16
            
            R-squared:                                      Obs per group:
                 Within  = 0.5455                                         min =         10
                 Between = 0.0263                                         avg =       10.0
                 Overall = 0.0142                                         max =         10
            
                                                            F(15,15)          =    2459.22
            corr(u_i, Xb) = -0.1565                         Prob > F          =     0.0000
            
                                                (Std. err. adjusted for 16 clusters in industry)
            ------------------------------------------------------------------------------------
                               |               Robust
            outputperhourwor~d | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
            -------------------+----------------------------------------------------------------
                           age |   .0949945   .1587471     0.60   0.559     -.243367     .433356
                        female |    3.13867   5.518367     0.57   0.578    -8.623451    14.90079
                      incouple |  -5.931795   4.444093    -1.33   0.202    -15.40415    3.540565
              lifesatisfaction |   2.916755   4.074887     0.72   0.485     -5.76866    11.60217
               highereducation |  -10.71285   5.047343    -2.12   0.051    -21.47101    .0453057
                      children |  -5.001213   5.170988    -0.97   0.349    -16.02291    6.020486
            grossmonthlyincome |   .0016221   .0011311     1.43   0.172    -.0007888     .004033
                 generalhealth |   3.680417   5.102521     0.72   0.482    -7.195348    14.55618
                ethnicmajority |   11.20793   9.162996     1.22   0.240    -8.322535    30.73839
                         urban |  -4.849715   2.958943    -1.64   0.122    -11.15655    1.457123
                        region |  -3.934039   .6250035    -6.29   0.000    -5.266202   -2.601876
                               |
                          year |
                         2010  |   .1648045    .842043     0.20   0.847    -1.629968    1.959577
                         2011  |  -.2509595   .6032141    -0.42   0.683     -1.53668    1.034761
                         2012  |  -.0724003   .7672533    -0.09   0.926    -1.707762    1.562961
                         2013  |   .0113145   1.008143     0.01   0.991    -2.137492    2.160121
                               |
                         _cons |   17.80361   16.98761     1.05   0.311    -18.40461    54.01183
            -------------------+----------------------------------------------------------------
                       sigma_u |   11.30491
                       sigma_e |  2.6429505
                           rho |  .94817579   (fraction of variance due to u_i)
            ------------------------------------------------------------------------------------
            I'm sorry, but I'm really stuck as to whether to just carry on with this regression and explain its flaws, try to fix it as much as possible, or drop variables to help improve its consistency. Any advice would be helpful.
            Thanks


            • #7
              Oliver:
              1) your pooled OLS possibly has a dummy in excess (ind1-ind15; or: have you got 16 -ind*- dummies and you've already omitted one of them?) and lacks robust standard errors if you detect heteroskedasticity. I'd not consider clustered standard errors here as, with 16 panels only, they may be more misleading than their default counterparts;
              2) your -fe- regression has an overfitting problem (the F-test is highly significant, while most of your coefficients are not). Try a more parsimonious model and see whether things get better.
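Both points can be sketched on toy data. The block below assumes the -nlswork- example dataset, with -ind_code- standing in for the industry identifier; the choice of regressors is purely illustrative. Factor-variable notation lets Stata build the industry dummies and omit the base category itself, and a joint -test- on a block of regressors is one possible input when deciding on a more parsimonious model.

```stata
* Toy data (assumption: nlswork stands in for the poster's data)
webuse nlswork, clear
xtset idcode year

* Point 1: pooled OLS with industry dummies via factor-variable notation;
* the base category is omitted automatically, and vce(robust) gives
* heteroskedasticity-robust (not clustered) standard errors
regress ln_wage age tenure i.ind_code, vce(robust)

* Point 2: fixed-effects model with a block of candidate regressors
xtreg ln_wage tenure hours not_smsa south i.year, fe vce(cluster idcode)

* Joint test of the block before deciding whether to drop it
test not_smsa south

* A more parsimonious version for comparison
xtreg ln_wage tenure hours i.year, fe vce(cluster idcode)
```

Clustering is used in the toy -fe- models only because -nlswork- has thousands of panels; with 16 panels the caveat in point 1 applies.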
              Kind regards,
              Carlo
              (Stata 19.0)
