Fixed and random effects regression

Oliver Brennan

Join Date: Aug 2022

Posts: 4
#1


26 Aug 2022, 10:50

Hi, I am currently running pooled OLS, fixed-effects and random-effects regressions. After fitting all three and running the Hausman test and the Breusch-Pagan test, I concluded that the fixed-effects model is the one to go with. However, most of the coefficients still have low t-values, even after adding robust to the fixed-effects model to help with the standard errors. My dependent variable is output per hour worked (a measure of productivity), and my independent variables are age, female, higher education, life satisfaction, general health, children in the household, ethnic majority, urban, gross monthly income, in a couple, dummy variables for occupations 1-7, region (Yorkshire and the UK as a whole) and 16 dummy variables for broad industries. I told Stata that the data are panel data by using
xtset industry
After that I checked everything and decided that the fixed-effects model with robust standard errors is the best model. These are my results:

. xtreg outputperhourworked age female lifesatisfaction highereducation grossmonthlyincome generalhealth urban region occ1
> -occ7, robust fe

Fixed-effects (within) regression Number of obs = 160
Group variable: industry Number of groups = 16

R-squared: Obs per group:
Within = 0.5771 min = 10
Between = 0.0053 avg = 10.0
Overall = 0.0526 max = 10

F(15,15) = 96060.12
corr(u_i, Xb) = -0.0838 Prob > F = 0.0000

(Std. err. adjusted for 16 clusters in industry)
------------------------------------------------------------------------------------
| Robust
outputperhourwor~d | Coefficient std. err. t P>|t| [95% conf. interval]
-------------------+----------------------------------------------------------------
age | .2078659 .1047539 1.98 0.066 -.0154118 .4311437
female | 2.958096 4.007467 0.74 0.472 -5.583618 11.49981
lifesatisfaction | 2.233016 4.195188 0.53 0.602 -6.708815 11.17485
highereducation | -9.022512 4.293936 -2.10 0.053 -18.17482 .1297953
grossmonthlyincome | .0010065 .0008429 1.19 0.251 -.0007902 .0028031
generalhealth | 1.581911 4.61318 0.34 0.736 -8.25085 11.41467
urban | -4.006694 2.964725 -1.35 0.197 -10.32586 2.312467
region | -3.74473 .7359868 -5.09 0.000 -5.313449 -2.176011
occ1 | -11.81984 20.39074 -0.58 0.571 -55.28168 31.64201
occ2 | 25.69072 13.93247 1.84 0.085 -4.005643 55.38708
occ3 | 10.87094 7.856785 1.38 0.187 -5.875397 27.61728
occ4 | 5.970971 8.645878 0.69 0.500 -12.45728 24.39922
occ5 | .9089322 7.28959 0.12 0.902 -14.62846 16.44633
occ6 | -6.257695 4.096937 -1.53 0.147 -14.99011 2.474719
occ7 | 3.073744 4.907315 0.63 0.540 -7.385949 13.53344
_cons | 14.56094 9.799501 1.49 0.158 -6.326203 35.44808
-------------------+----------------------------------------------------------------
sigma_u | 10.992994
sigma_e | 2.549267
rho | .94896714 (fraction of variance due to u_i)
------------------------------------------------------------------------------------

The R-squared value seems appropriate, but the majority of the t-values are still low. Is there anything I can do about this, or should I carry on with these results, or use the pooled OLS or random-effects regression instead?
Any help would be appreciated.
Thanks
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17851
#2

26 Aug 2022, 11:17

Oliver:
welcome to this forum.
The -hausman- test does not support non-default standard errors; therefore, it is not clear how you performed it.
In addition, 16 panels are not enough to be confident that non-default standard errors are not misleading.
Finally, almost no coefficient in your regression reaches conventional statistical significance (only -region- does): this sounds strange and calls for a double-check of the correctness of your regression model specification.
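
For reference, the usual -hausman- workflow with default standard errors can be sketched as below. This is only an illustration on Stata's shipped -grunfeld- example data (an assumption; it is not Oliver's dataset), and invest, mvalue and kstock are variables of that dataset, not of the model discussed here.

```stata
* Sketch only: grunfeld is a small balanced example panel (10 firms, 20 years).
webuse grunfeld, clear
xtset company year

* Fit both models with default (conventional) standard errors,
* because -hausman- does not accept robust or clustered ones.
xtreg invest mvalue kstock, fe
estimates store fe

xtreg invest mvalue kstock, re
estimates store re

* Consistent estimator first, efficient estimator second.
hausman fe re, sigmamore
```

The -sigmamore- option bases both covariance matrices on the same estimate of the disturbance variance, which often helps when the plain test cannot be computed.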

Last edited by Carlo Lazzaro; 26 Aug 2022, 11:54.

Kind regards,
Carlo
(Stata 19.0)
Oliver Brennan

Join Date: Aug 2022

Posts: 4
#3

26 Aug 2022, 13:16

Hi Carlo, thanks for the reply. Originally I ran the pooled OLS regression without the industry-level dummy variables, like this:
. reg outputperhourworked age female incouple lifesatisfaction highereducation children grossmonthlyincome generalhealth e
> thnicmajority urban occ1-occ7 region

Source | SS df MS Number of obs = 160
-------------+---------------------------------- F(18, 141) = 12.40
Model | 12187.817 18 677.100944 Prob > F = 0.0000
Residual | 7696.61261 141 54.5859051 R-squared = 0.6129
-------------+---------------------------------- Adj R-squared = 0.5635
Total | 19884.4296 159 125.059306 Root MSE = 7.3882

------------------------------------------------------------------------------------
outputperhourwor~d | Coefficient Std. err. t P>|t| [95% conf. interval]
-------------------+----------------------------------------------------------------
age | -2.091687 .3650759 -5.73 0.000 -2.813417 -1.369957
female | 9.237893 4.437883 2.08 0.039 .4645016 18.01128
incouple | 6.362442 9.516697 0.67 0.505 -12.45142 25.1763
lifesatisfaction | 19.59183 8.017308 2.44 0.016 3.742165 35.4415
highereducation | 3.865053 7.930896 0.49 0.627 -11.81379 19.54389
children | -37.9613 10.12877 -3.75 0.000 -57.9852 -17.93741
grossmonthlyincome | .0048509 .0014836 3.27 0.001 .0019178 .007784
generalhealth | -11.11036 7.799978 -1.42 0.157 -26.53038 4.309662
ethnicmajority | 32.59098 13.35653 2.44 0.016 6.186042 58.99592
urban | -39.49991 7.304811 -5.41 0.000 -53.94103 -25.0588
occ1 | 122.1594 20.36826 6.00 0.000 81.89273 162.426
occ2 | 15.67076 15.412 1.02 0.311 -14.7977 46.13922
occ3 | 30.5377 10.90227 2.80 0.006 8.98467 52.09074
occ4 | 16.64603 10.22802 1.63 0.106 -3.574065 36.86613
occ5 | 23.50738 9.260587 2.54 0.012 5.199835 41.81493
occ6 | 47.06718 13.58978 3.46 0.001 20.20112 73.93324
occ7 | -9.415526 10.3078 -0.91 0.363 -29.79334 10.96229
region | -3.198695 1.362628 -2.35 0.020 -5.892516 -.5048734
_cons | 75.65066 19.33279 3.91 0.000 37.43106 113.8703
------------------------------------------------------------------------------------

Then I went on to fit the fixed-effects model:

. xtset industry

Panel variable: industry (balanced)

. xtreg outputperhourworked age female incouple lifesatisfaction highereducation children grossmonthlyincome generalhealth
> ethnicmajority urban occ1-occ7 region, fe

Fixed-effects (within) regression Number of obs = 160
Group variable: industry Number of groups = 16

R-squared: Obs per group:
Within = 0.5879 min = 10
Between = 0.0036 avg = 10.0
Overall = 0.0261 max = 10

F(18,126) = 9.99
corr(u_i, Xb) = -0.1426 Prob > F = 0.0000

------------------------------------------------------------------------------------
outputperhourwor~d | Coefficient Std. err. t P>|t| [95% conf. interval]
-------------------+----------------------------------------------------------------
age | .2117937 .1642327 1.29 0.200 -.1132179 .5368053
female | 4.089059 4.021471 1.02 0.311 -3.869314 12.04743
incouple | -4.89975 3.81921 -1.28 0.202 -12.45785 2.658355
lifesatisfaction | 2.587099 3.188801 0.81 0.419 -3.723444 8.897641
highereducation | -11.23304 5.169229 -2.17 0.032 -21.46279 -1.003285
children | -4.171664 4.826863 -0.86 0.389 -13.72388 5.380555
grossmonthlyincome | .001408 .0007007 2.01 0.047 .0000212 .0027947
generalhealth | 1.908362 4.194004 0.46 0.650 -6.391449 10.20817
ethnicmajority | 4.843447 5.912025 0.82 0.414 -6.856277 16.54317
urban | -1.49449 4.186699 -0.36 0.722 -9.779844 6.790864
occ1 | -10.15136 12.80911 -0.79 0.430 -35.50021 15.19749
occ2 | 21.74343 11.48948 1.89 0.061 -.9939112 44.48077
occ3 | 12.9459 7.313564 1.77 0.079 -1.527428 27.41923
occ4 | 7.096606 8.712475 0.81 0.417 -10.14513 24.33834
occ5 | 6.077322 7.416389 0.82 0.414 -8.599494 20.75414
occ6 | -4.054957 6.805379 -0.60 0.552 -17.5226 9.412688
occ7 | 7.553854 7.284875 1.04 0.302 -6.862699 21.97041
region | -3.826225 .5744232 -6.66 0.000 -4.962992 -2.689458
_cons | 9.833199 10.50739 0.94 0.351 -10.96062 30.62702
-------------------+----------------------------------------------------------------
sigma_u | 11.235124
sigma_e | 2.5462105
rho | .95114815 (fraction of variance due to u_i)
------------------------------------------------------------------------------------
F test that all u_i=0: F(15, 126) = 70.74 Prob > F = 0.0000

Then I fitted the random-effects model:

Random-effects GLS regression Number of obs = 160
Group variable: industry Number of groups = 16

R-squared: Obs per group:
Within = 0.1357 min = 10
Between = 0.8229 avg = 10.0
Overall = 0.6129 max = 10

Wald chi2(18) = 223.28
corr(u_i, X) = 0 (assumed) Prob > chi2 = 0.0000

------------------------------------------------------------------------------------
outputperhourwor~d | Coefficient Std. err. z P>|z| [95% conf. interval]
-------------------+----------------------------------------------------------------
age | -2.091687 .3650759 -5.73 0.000 -2.807223 -1.376152
female | 9.237893 4.437883 2.08 0.037 .5398014 17.93598
incouple | 6.362442 9.516697 0.67 0.504 -12.28994 25.01482
lifesatisfaction | 19.59183 8.017308 2.44 0.015 3.878199 35.30547
highereducation | 3.865053 7.930896 0.49 0.626 -11.67922 19.40932
children | -37.9613 10.12877 -3.75 0.000 -57.81334 -18.10927
grossmonthlyincome | .0048509 .0014836 3.27 0.001 .001943 .0077588
generalhealth | -11.11036 7.799978 -1.42 0.154 -26.39804 4.177316
ethnicmajority | 32.59098 13.35653 2.44 0.015 6.412669 58.7693
urban | -39.49991 7.304811 -5.41 0.000 -53.81708 -25.18275
occ1 | 122.1594 20.36826 6.00 0.000 82.23833 162.0804
occ2 | 15.67076 15.412 1.02 0.309 -14.5362 45.87771
occ3 | 30.5377 10.90227 2.80 0.005 9.169654 51.90575
occ4 | 16.64603 10.22802 1.63 0.104 -3.400521 36.69259
occ5 | 23.50738 9.260587 2.54 0.011 5.356964 41.6578
occ6 | 47.06718 13.58978 3.46 0.001 20.4317 73.70266
occ7 | -9.415526 10.3078 -0.91 0.361 -29.61844 10.78739
region | -3.198695 1.362628 -2.35 0.019 -5.869396 -.5279938
_cons | 75.65066 19.33279 3.91 0.000 37.75909 113.5422
-------------------+----------------------------------------------------------------
sigma_u | 0
sigma_e | 2.5462105
rho | 0 (fraction of variance due to u_i)
------------------------------------------------------------------------------------

Then I ran the Hausman test, which gave:

. hausman fe re

Note: the rank of the differenced variance matrix (17) does not equal the number of coefficients being tested (18); be
sure this is what you expect, or there may be problems computing the test. Examine the output of your estimators
for anything unexpected and possibly consider scaling your variables so that the coefficients are on a similar
scale.

---- Coefficients ----
| (b) (B) (b-B) sqrt(diag(V_b-V_B))
| fe re Difference Std. err.
-------------+----------------------------------------------------------------
age | .2117937 -2.091687 2.303481 .
female | 4.089059 9.237893 -5.148834 .
incouple | -4.89975 6.362442 -11.26219 .
lifesatisf~n | 2.587099 19.59183 -17.00474 .
highereduc~n | -11.23304 3.865053 -15.09809 .
children | -4.171664 -37.9613 33.78964 .
grossmonth~e | .001408 .0048509 -.0034429 .
generalhea~h | 1.908362 -11.11036 13.01872 .
ethnicmajo~y | 4.843447 32.59098 -27.74754 .
urban | -1.49449 -39.49991 38.00542 .
occ1 | -10.15136 122.1594 -132.3107 .
occ2 | 21.74343 15.67076 6.072671 .
occ3 | 12.9459 30.5377 -17.5918 .
occ4 | 7.096606 16.64603 -9.549428 .
occ5 | 6.077322 23.50738 -17.43006 .
occ6 | -4.054957 47.06718 -51.12214 .
occ7 | 7.553854 -9.415526 16.96938 .
region | -3.826225 -3.198695 -.62753 .
------------------------------------------------------------------------------
b = Consistent under H0 and Ha; obtained from xtreg.
B = Inconsistent under Ha, efficient under H0; obtained from xtreg.

Test of H0: Difference in coefficients not systematic

chi2(17) = (b-B)'[(V_b-V_B)^(-1)](b-B)
= -700.85

Warning: chi2 < 0 ==> model fitted on these data
fails to meet the asymptotic assumptions
of the Hausman test; see suest for a
generalized test.

So I added the sigmamore option, which gave:

. hausman fe re, sigmamore

Note: the rank of the differenced variance matrix (15) does not equal the number of coefficients being tested (18); be
sure this is what you expect, or there may be problems computing the test. Examine the output of your estimators
for anything unexpected and possibly consider scaling your variables so that the coefficients are on a similar
scale.

---- Coefficients ----
| (b) (B) (b-B) sqrt(diag(V_b-V_B))
| fe re Difference Std. err.
-------------+----------------------------------------------------------------
age | .2117937 -2.091687 2.303481 .3062946
female | 4.089059 9.237893 -5.148834 10.79208
incouple | -4.89975 6.362442 -11.26219 5.678378
lifesatisf~n | 2.587099 19.59183 -17.00474 4.619213
highereduc~n | -11.23304 3.865053 -15.09809 12.73109
children | -4.171664 -37.9613 33.78964 9.673316
grossmonth~e | .001408 .0048509 -.0034429 .0013904
generalhea~h | 1.908362 -11.11036 13.01872 9.341224
ethnicmajo~y | 4.843447 32.59098 -27.74754 10.76502
urban | -1.49449 -39.49991 38.00542 9.706822
occ1 | -10.15136 122.1594 -132.3107 31.08965
occ2 | 21.74343 15.67076 6.072671 29.56226
occ3 | 12.9459 30.5377 -17.5918 18.20688
occ4 | 7.096606 16.64603 -9.549428 23.11918
occ5 | 6.077322 23.50738 -17.43006 19.42534
occ6 | -4.054957 47.06718 -51.12214 14.32678
occ7 | 7.553854 -9.415526 16.96938 18.45462
region | -3.826225 -3.198695 -.62753 .9598942
------------------------------------------------------------------------------
b = Consistent under H0 and Ha; obtained from xtreg.
B = Inconsistent under Ha, efficient under H0; obtained from xtreg.

Test of H0: Difference in coefficients not systematic

chi2(15) = (b-B)'[(V_b-V_B)^(-1)](b-B)
= 126.03
Prob > chi2 = 0.0000
(V_b-V_B is not positive definite)

This tells me that I should use the fixed-effects model; however, as a precaution I also ran the Breusch-Pagan Lagrange multiplier test. This gave:

. xttest0

Breusch and Pagan Lagrangian multiplier test for random effects

outputperhourworked[industry,t] = Xb + u[industry] + e[industry,t]

Estimated results:
| Var SD = sqrt(Var)
---------+-----------------------------
outputp~d | 125.0593 11.18299
e | 6.483188 2.546211
u | 0 0

Test: Var(u) = 0
chibar2(01) = 0.00
Prob > chibar2 = 1.0000
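
As a point of reference, -xttest0- is run immediately after -xtreg, re-. The sketch below uses Stata's -grunfeld- example data as a stand-in (an assumption; the variables are not those of this thread). When the estimated Var(u) is 0, as in the output above, the random-effects estimates coincide with pooled OLS, which is what the identical coefficient tables in this post show.

```stata
* Sketch only: example data, not the dataset discussed in this thread.
webuse grunfeld, clear
xtset company year

* -xttest0- uses the results of the most recent -xtreg, re-.
xtreg invest mvalue kstock, re
xttest0
```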

Next I checked for heteroskedasticity in the fixed-effects model:

. xttest3

Modified Wald test for groupwise heteroskedasticity
in fixed effect regression model

H0: sigma(i)^2 = sigma^2 for all i

chi2 (16) = 1186.33
Prob>chi2 = 0.0000
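
The sequence just described (fit -xtreg, fe-, test for groupwise heteroskedasticity, then refit with cluster-robust standard errors) might look like the sketch below. It assumes Stata's -grunfeld- example data rather than the data in this thread, and it assumes -xttest3- has to be installed first, since it is a community-contributed command distributed through SSC.

```stata
* xttest3 is community-contributed; install it once from SSC.
ssc install xttest3, replace

* Sketch only: example data, not the dataset discussed in this thread.
webuse grunfeld, clear
xtset company year

* Modified Wald test for groupwise heteroskedasticity after -xtreg, fe-.
xtreg invest mvalue kstock, fe
xttest3

* Same model with standard errors clustered on the panel variable.
xtreg invest mvalue kstock, fe vce(cluster company)
```

With -xtreg, fe-, the -robust- option and -vce(cluster panelvar)- give the same standard errors, which is why the output reports them as adjusted for clusters.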

So from this I decided to add the robust option to the fixed-effects model, which gives:

. xtreg outputperhourworked age female incouple lifesatisfaction highereducation children grossmonthlyincome generalhealth
> ethnicmajority urban occ1-occ7 region, robust fe

Fixed-effects (within) regression Number of obs = 160
Group variable: industry Number of groups = 16

R-squared: Obs per group:
Within = 0.5879 min = 10
Between = 0.0036 avg = 10.0
Overall = 0.0261 max = 10

F(15,15) = .
corr(u_i, Xb) = -0.1426 Prob > F = .

(Std. err. adjusted for 16 clusters in industry)
------------------------------------------------------------------------------------
| Robust
outputperhourwor~d | Coefficient std. err. t P>|t| [95% conf. interval]
-------------------+----------------------------------------------------------------
age | .2117937 .1972323 1.07 0.300 -.208597 .6321844
female | 4.089059 4.012366 1.02 0.324 -4.463097 12.64121
incouple | -4.89975 4.403377 -1.11 0.283 -14.28533 4.485826
lifesatisfaction | 2.587099 4.156245 0.62 0.543 -6.271728 11.44593
highereducation | -11.23304 4.482184 -2.51 0.024 -20.78659 -1.679489
children | -4.171664 5.229833 -0.80 0.438 -15.31879 6.97546
grossmonthlyincome | .001408 .0008818 1.60 0.131 -.0004715 .0032875
generalhealth | 1.908362 5.440297 0.35 0.731 -9.687357 13.50408
ethnicmajority | 4.843447 4.641975 1.04 0.313 -5.050689 14.73758
urban | -1.49449 3.20119 -0.47 0.647 -8.317664 5.328684
occ1 | -10.15136 22.45018 -0.45 0.658 -58.00279 37.70007
occ2 | 21.74343 13.15912 1.65 0.119 -6.304562 49.79142
occ3 | 12.9459 7.491032 1.73 0.104 -3.020857 28.91266
occ4 | 7.096606 9.876709 0.72 0.483 -13.9551 28.14831
occ5 | 6.077322 6.362956 0.96 0.355 -7.484997 19.63964
occ6 | -4.054957 3.760967 -1.08 0.298 -12.07127 3.961355
occ7 | 7.553854 4.79697 1.57 0.136 -2.670645 17.77835
region | -3.826225 .7289406 -5.25 0.000 -5.379925 -2.272525
_cons | 9.833199 13.66569 0.72 0.483 -19.29454 38.96094
-------------------+----------------------------------------------------------------
sigma_u | 11.235124
sigma_e | 2.5462105
rho | .95114815 (fraction of variance due to u_i)
------------------------------------------------------------------------------------

The problem with this result is that the R-squared value seems appropriate, but the majority of the coefficients are still insignificant. I am wondering whether to carry on with this fixed-effects regression or to use the pooled OLS regression or the random-effects model instead.
There are 16 panels, but the data for all the other variables are aggregated, so they consist of many observations aggregated to industry level for 16 industries over 5 years.
Any help would be appreciated.
Thank you

Carlo Lazzaro

Join Date: Apr 2014
Posts: 17851
#4

27 Aug 2022, 01:05

Oliver:
please use CODE delimiters to share what you typed and what Stata gave you back. Thanks.
That said:
1) a pooled OLS without -i.industry- as a predictor (and standard errors clustered on -industry-) makes no sense at all;
2) assuming that you have a -repeated time values within panel- issue with -xtset-, it is not clear why you did not include -i.year- among the predictors of your -xtreg,fe- equation;
3) you have heteroskedasticity. Therefore, even though the number of clusters is low, it makes sense to go -xtreg,fe- with cluster-robust standard errors;
4) -xtreg,re- is out of the question, as per the -xttest0- outcome;
5) what you should do is test whether the functional form of the regressand in your -xtreg,fe- model is correctly specified, as per the following toy example:

Code:

. xtreg ln_wage fitted sq_fitted , fe vce(cluster idcode)

Fixed-effects (within) regression               Number of obs     =     28,510
Group variable: idcode                          Number of groups  =      4,710

R-squared:                                      Obs per group:
     Within  = 0.1092                                         min =          1
     Between = 0.1033                                         avg =        6.1
     Overall = 0.0881                                         max =         15

                                                F(2,4709)         =     523.09
corr(u_i, Xb) = 0.0467                          Prob > F          =     0.0000

                             (Std. err. adjusted for 4,710 clusters in idcode)
------------------------------------------------------------------------------
             |               Robust
     ln_wage | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
      fitted |   2.569185   .7085064     3.63   0.000     1.180181    3.958189
   sq_fitted |    -.47432   .2153021    -2.20   0.028    -.8964128   -.0522272
       _cons |  -1.290258    .580562    -2.22   0.026    -2.428431   -.1520844
-------------+----------------------------------------------------------------
     sigma_u |    .403403
     sigma_e |  .30238578
         rho |  .64025357   (fraction of variance due to u_i)
------------------------------------------------------------------------------

.

As -sq_fitted- reaches statistical significance, the model is misspecified.
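
The toy example shows only the final regression. The steps that typically lead up to it can be sketched as follows, assuming Stata's -nlswork- example data (which has -ln_wage- and -idcode-) and a first-stage specification picked purely for illustration; the regressors below are an assumption, so the numbers need not match the output above.

```stata
* Sketch only: the first-stage regressors are hypothetical.
webuse nlswork, clear
xtset idcode year

* Step 1: fit the model whose functional form is being checked.
xtreg ln_wage c.age##c.age tenure, fe vce(cluster idcode)

* Step 2: linear prediction and its square.
predict fitted, xb
generate sq_fitted = fitted^2

* Step 3: regress the outcome on both and test the squared term.
xtreg ln_wage fitted sq_fitted, fe vce(cluster idcode)
test sq_fitted
```

A significant -sq_fitted- suggests that something is missing from the first-stage specification, such as a nonlinear term or an interaction.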

Kind regards,
Carlo
(Stata 19.0)


Oliver Brennan

Join Date: Aug 2022

Posts: 4
#5

27 Aug 2022, 04:08

Now, this is my pooled ols regression including industries:

Code:

reg outputperhourworked age female incouple lifesatisfaction highereducation children grossmonthlyincome generalhealth ethnicmajority urban ind1-ind15

As you said, I have included i.wave in the xtreg, fe equation, which gives:

Code:

xtreg outputperhourworked age female incouple lifesatisfaction highereducation children grossmonthlyincome generalhealth ethnicmajority urban region i.wave, fe

Then finally I used xtreg, fe with robust standard errors, which gives:

Code:

xtreg outputperhourworked age female incouple lifesatisfaction highereducation children grossmonthlyincome generalhealth ethnicmajority urban region i.wave, robust fe

I'm sorry, but I'm really stuck as to whether to just carry on with this regression and explain its flaws, or try to fix it as much as possible, or drop variables to help improve its consistency.
Thanks

Oliver Brennan

Join Date: Aug 2022
Posts: 4
#6

27 Aug 2022, 04:13

Now, this is my pooled ols regression including industries:

Code:

reg outputperhourworked age female incouple lifesatisfaction highereducation children grossmonthlyincome generalhealth ethnicmajority urban ind1-ind15
      Source |       SS           df       MS      Number of obs   =       160
-------------+----------------------------------   F(25, 134)      =     80.10
       Model |  18637.3043        25  745.492173   Prob > F        =    0.0000
    Residual |  1247.12528       134  9.30690506   R-squared       =    0.9373
-------------+----------------------------------   Adj R-squared   =    0.9256
       Total |  19884.4296       159  125.059306   Root MSE        =    3.0507

------------------------------------------------------------------------------------
outputperhourwor~d | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------------+----------------------------------------------------------------
               age |   .2748922   .1749598     1.57   0.119    -.0711477    .6209322
            female |   4.790014    3.91466     1.22   0.223    -2.952502    12.53253
          incouple |  -5.923835   4.461654    -1.33   0.187    -14.74821    2.900539
  lifesatisfaction |  -.4209222   3.204008    -0.13   0.896    -6.757891    5.916046
   highereducation |  -11.82595   5.525567    -2.14   0.034    -22.75456   -.8973413
          children |  -10.10135   4.291029    -2.35   0.020    -18.58825   -1.614438
grossmonthlyincome |   .0036281    .000617     5.88   0.000     .0024077    .0048484
     generalhealth |    6.18834   3.767032     1.64   0.103    -1.262192    13.63887
    ethnicmajority |    2.55508   6.374079     0.40   0.689    -10.05174     15.1619
             urban |  -15.33324   3.824228    -4.01   0.000    -22.89689   -7.769579
              ind1 |   27.62078   3.018881     9.15   0.000     21.64996     33.5916
              ind2 |   15.72706    2.87707     5.47   0.000     10.03672    21.41741
              ind3 |   5.812294   3.181078     1.83   0.070    -.4793249    12.10391
              ind4 |   36.17205   3.068046    11.79   0.000     30.10399    42.24011
              ind5 |   10.15522   2.437754     4.17   0.000     5.333771    14.97668
              ind6 |   4.281885   2.217484     1.93   0.056    -.1039126    8.667683
              ind7 |   4.581402   2.807222     1.63   0.105    -.9707948     10.1336
              ind8 |  -.7753631   2.342294    -0.33   0.741    -5.408012    3.857286
              ind9 |   28.39202   3.318566     8.56   0.000     21.82848    34.95557
             ind10 |   9.816667   3.159917     3.11   0.002     3.566901    16.06643
             ind11 |    1.28196   2.414443     0.53   0.596    -3.493386    6.057307
             ind12 |   10.04032   2.876909     3.49   0.001     4.350295    15.73034
             ind13 |   11.14044   3.174646     3.51   0.001     4.861549    17.41934
             ind14 |   2.915025   2.466157     1.18   0.239    -1.962603    7.792654
             ind15 |   2.553465   3.163255     0.81   0.421    -3.702902    8.809833
             _cons |   7.472767   11.23417     0.67   0.507    -14.74646    29.69199
------------------------------------------------------------------------------------

As you said, I have included i.year in the xtreg, fe equation, which gives:

Code:

xtreg outputperhourworked age female incouple lifesatisfaction highereducation children grossmonthlyincome generalhealth ethnicmajority urban region i.year, fe
Fixed-effects (within) regression               Number of obs     =        160
Group variable: industry                        Number of groups  =         16

R-squared:                                      Obs per group:
     Within  = 0.5455                                         min =         10
     Between = 0.0263                                         avg =       10.0
     Overall = 0.0142                                         max =         10

                                                F(15,129)         =      10.32
corr(u_i, Xb) = -0.1565                         Prob > F          =     0.0000

------------------------------------------------------------------------------------
outputperhourwor~d | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------------+----------------------------------------------------------------
               age |   .0949945   .1676812     0.57   0.572    -.2367669    .4267559
            female |    3.13867   3.446668     0.91   0.364    -3.680647    9.957988
          incouple |  -5.931795   4.000349    -1.48   0.141    -13.84658    1.982993
  lifesatisfaction |   2.916755   3.299943     0.88   0.378    -3.612262    9.445773
   highereducation |  -10.71285    4.94944    -2.16   0.032    -20.50544   -.9202642
          children |  -5.001213    3.96734    -1.26   0.210    -12.85069    2.848267
grossmonthlyincome |   .0016221   .0006904     2.35   0.020     .0002562     .002988
     generalhealth |   3.680417   3.441799     1.07   0.287    -3.129267     10.4901
    ethnicmajority |   11.20793   5.736378     1.95   0.053    -.1416344    22.55749
             urban |  -4.849715   3.683193    -1.32   0.190      -12.137    2.437572
            region |  -3.934039   .6097958    -6.45   0.000    -5.140535   -2.727543
                   |
              year |
             2010  |   .1648045   .6929738     0.24   0.812    -1.206261     1.53587
             2011  |  -.2509595   .7855941    -0.32   0.750    -1.805277    1.303358
             2012  |  -.0724003   .7496864    -0.10   0.923    -1.555673    1.410873
             2013  |   .0113145   .8777164     0.01   0.990    -1.725269    1.747898
                   |
             _cons |   17.80361    9.07735     1.96   0.052    -.1561492    35.76337
-------------------+----------------------------------------------------------------
           sigma_u |   11.30491
           sigma_e |  2.6429505
               rho |  .94817579   (fraction of variance due to u_i)
------------------------------------------------------------------------------------
F test that all u_i=0: F(15, 129) = 102.29                   Prob > F = 0.0000

Then finally I used xtreg, fe with robust standard errors, which gives:

Code:

xtreg outputperhourworked age female incouple lifesatisfaction highereducation children grossmonthlyincome generalhealth ethnicmajority urban region i.year, robust fe

Fixed-effects (within) regression               Number of obs     =        160
Group variable: industry                        Number of groups  =         16

R-squared:                                      Obs per group:
     Within  = 0.5455                                         min =         10
     Between = 0.0263                                         avg =       10.0
     Overall = 0.0142                                         max =         10

                                                F(15,15)          =    2459.22
corr(u_i, Xb) = -0.1565                         Prob > F          =     0.0000

                                    (Std. err. adjusted for 16 clusters in industry)
------------------------------------------------------------------------------------
                   |               Robust
outputperhourwor~d | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------------+----------------------------------------------------------------
               age |   .0949945   .1587471     0.60   0.559     -.243367     .433356
            female |    3.13867   5.518367     0.57   0.578    -8.623451    14.90079
          incouple |  -5.931795   4.444093    -1.33   0.202    -15.40415    3.540565
  lifesatisfaction |   2.916755   4.074887     0.72   0.485     -5.76866    11.60217
   highereducation |  -10.71285   5.047343    -2.12   0.051    -21.47101    .0453057
          children |  -5.001213   5.170988    -0.97   0.349    -16.02291    6.020486
grossmonthlyincome |   .0016221   .0011311     1.43   0.172    -.0007888     .004033
     generalhealth |   3.680417   5.102521     0.72   0.482    -7.195348    14.55618
    ethnicmajority |   11.20793   9.162996     1.22   0.240    -8.322535    30.73839
             urban |  -4.849715   2.958943    -1.64   0.122    -11.15655    1.457123
            region |  -3.934039   .6250035    -6.29   0.000    -5.266202   -2.601876
                   |
              year |
             2010  |   .1648045    .842043     0.20   0.847    -1.629968    1.959577
             2011  |  -.2509595   .6032141    -0.42   0.683     -1.53668    1.034761
             2012  |  -.0724003   .7672533    -0.09   0.926    -1.707762    1.562961
             2013  |   .0113145   1.008143     0.01   0.991    -2.137492    2.160121
                   |
             _cons |   17.80361   16.98761     1.05   0.311    -18.40461    54.01183
-------------------+----------------------------------------------------------------
           sigma_u |   11.30491
           sigma_e |  2.6429505
               rho |  .94817579   (fraction of variance due to u_i)
------------------------------------------------------------------------------------

I'm sorry, but I'm really stuck as to whether to just carry on with this regression and explain its flaws, or try to fix it as much as possible, or drop variables to help improve its consistency. Any advice would be helpful.
Thanks


Carlo Lazzaro

Join Date: Apr 2014

Posts: 17851
#7

27 Aug 2022, 06:59

Oliver:
1) your pooled OLS possibly has one dummy too many (ind1-ind15; or: have you got 16 -ind*- dummies and you've already omitted one of them?) and lacks robust standard errors, which you need if you detect heteroskedasticity. I'd not consider clustered standard errors here as, with 16 panels only, they may be more misleading than their default counterparts;
2) your -fe- regression has an overfitting problem (the F-test is highly significant, while most of your coefficients are not). Try a more parsimonious model and see whether things get better.
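
Both points can be sketched on Stata's -grunfeld- example data (an assumption; these are not the variables of this thread). Using factor-variable notation lets Stata pick the base category itself, which avoids the question of one dummy too many, and -testparm- is shown here as one possible way, not the only one, to judge whether a block of predictors can be dropped.

```stata
* Sketch only: example data, not the dataset discussed in this thread.
webuse grunfeld, clear
xtset company year

* 1) Pooled OLS with panel dummies and heteroskedasticity-robust
*    (not clustered) standard errors; one category is omitted automatically.
regress invest mvalue kstock i.company, vce(robust)

* 2) Fixed effects with year dummies, then a joint test of those dummies.
xtreg invest mvalue kstock i.year, fe
testparm i.year

* A more parsimonious alternative to compare against.
xtreg invest mvalue kstock, fe
```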

Kind regards,
Carlo
(Stata 19.0)
