Fixed, Random or Mixed model?

Maria Petraki

Join Date: Aug 2024

Posts: 6
#1


26 Aug 2024, 07:46

Hi all,

I am new to panel data regression analysis and Stata, so forgive me if my questions are too elementary. I have some questions about the use of random or fixed effect models and using the correct estimators.

Data characteristics:
- Panel data
- Balanced
- T = 15 years (2009-2023) and N = 17 European countries (T < N)
- Dependent variable: self-rated bad or very bad health (SPHBVB)
- Independent variables: temporary employment (TEMPEMPL), part-time employment (PARTIME), self-employment without employees (SELFEMPLnoEMPLs), and unemployment (UNEMPLOYMENT). All variables are percentages.

Aim of analysis:
- To perform a regression analysis that is efficient and consistent under robustness tests

Method:
Estimate pooled OLS, FE, and RE models, or a mixed model

1. First, I checked the correlations among all variables.

[Image unavailable: correlation matrix]

Comment: The dependent variable is most strongly correlated with SELFEMPLnoEMPLs and PARTIME. Among the independent variables, the highest correlation is between SELFEMPLnoEMPLs and PARTIME.

2. Then I ran fixed-effects and random-effects regressions.

In both cases only the variable SELFEMPLnoEMPLs was statistically significant.
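
Step 2 could be sketched roughly as below. This is only an illustration: the panel identifier `id` is taken from the output later in this thread, while the time variable name `year` is an assumption, as are the stored-estimate names `fe` and `re`.

```stata
* Sketch only: "year" is an assumed name for the time variable
xtset id year

* Fixed-effects (within) estimator
xtreg SPHBVB TEMPEMPL PARTIME SELFEMPLnoEMPLs UNEMPLOYMENT, fe
estimates store fe

* Random-effects GLS estimator
xtreg SPHBVB TEMPEMPL PARTIME SELFEMPLnoEMPLs UNEMPLOYMENT, re
estimates store re

* Coefficients and standard errors side by side
estimates table fe re, b(%9.4f) se(%9.4f)
```

Putting the two sets of estimates side by side makes it easier to see whether the FE and RE coefficients differ in sign or size, not only in significance.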

3. Then I ran the Breusch-Pagan LM test (xttest0), which gave Prob > chibar2 = 0.0000, so RE is preferred to pooled OLS.

[Image unavailable: xttest0 output]

4. Tests: I checked for autocorrelation using xtserial and for heteroscedasticity using xttest2; the results suggest that my data suffer from both problems.

a. xtserial SPHBVB TEMPEMPL PARTIME SELFEMPLnoEMPLs UNEMPLOYMENT

Wooldridge test for autocorrelation in panel data
H0: no first order autocorrelation
F( 1, 16) = 17.066
Prob > F = 0.0008

b. xttest2
Breusch-Pagan LM test of independence: chi2(136) = 216.828, Pr = 0.0000
Based on 15 complete observations over panel units

c. The VIF and 1/VIF values were also satisfactory.
[Image unavailable: VIF output]
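
A note on the tests in step 4, with a hedged sketch. The -xttest2- output in b. is labelled a test of independence, that is, it concerns correlation across panels rather than heteroscedasticity. Groupwise heteroscedasticity is what the companion command -xttest3- tests. Both are community-contributed commands that run after -xtreg, fe-; the sketch assumes they and -xtserial- are installed and that the data are already -xtset-.

```stata
* Wooldridge test for first-order serial correlation (community-contributed)
xtserial SPHBVB TEMPEMPL PARTIME SELFEMPLnoEMPLs UNEMPLOYMENT

* xttest2 and xttest3 are run after a fixed-effects fit
* (if missing, they can be installed with: ssc install xttest2 / ssc install xttest3)
xtreg SPHBVB TEMPEMPL PARTIME SELFEMPLnoEMPLs UNEMPLOYMENT, fe

* Breusch-Pagan LM test of cross-sectional independence
xttest2

* Modified Wald test for groupwise heteroscedasticity
xttest3
```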
5. After reading other questions on this site, I found that autocorrelation and heteroscedasticity can be dealt with by clustering the standard errors. So I ran the random-effects regression with the cluster option. Only the variable SELFEMPLnoEMPLs is significant.

[Image unavailable: random-effects regression output with clustered standard errors]

xtoverid

Test of overidentifying restrictions: fixed vs random effects
Cross-section time-series model: xtreg re robust cluster(id)
Sargan-Hansen statistic 3.062 Chi-sq(4) P-value = 0.5475

6. I then dropped variables and, after several attempts, found that only PARTIME and SELFEMPLnoEMPLs are significant, although R-squared decreased after omitting TEMPEMPL and then UNEMPLOYMENT.
[Image unavailable: regression output for the reduced model]

xtoverid
Test of overidentifying restrictions: fixed vs random effects
Cross-section time-series model: xtreg re robust cluster(CODE)
Sargan-Hansen statistic 0.406 Chi-sq(2) P-value = 0.8161

So the -xtoverid- output suggests that the -re- model is the way to go.
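
For readers following along, the sequence behind the -xtoverid- results in steps 5 and 6 might look like the sketch below. It assumes the community-contributed -xtoverid- is installed (it is available from SSC) and that the panel identifier is `id`, as in the first -xtoverid- output above; the second output refers to it as `CODE`.

```stata
* Random effects with cluster-robust standard errors (step 5)
xtreg SPHBVB TEMPEMPL PARTIME SELFEMPLnoEMPLs UNEMPLOYMENT, re vce(cluster id)

* Robust alternative to -hausman-; the null hypothesis is that the extra
* orthogonality conditions imposed by random effects are valid
xtoverid

* Reduced model (step 6)
xtreg SPHBVB PARTIME SELFEMPLnoEMPLs, re vce(cluster id)
xtoverid
```

A large p-value means the test fails to reject the random-effects specification. With only 17 clusters that is weak evidence rather than proof that -re- is correct.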

Questions

1. Should I accept the last findings (6) as the most appropriate and consistent with theoretical models, or discuss and accept (5)?

To finish, two last questions that are crucial for my research.
In many similar studies, social scientists adopt a fixed-effects model ad hoc, or at least present it for comparison, even when the random-effects model proves to be appropriate.

IIa. I would appreciate comments on the finding below (the variable Welfare3 refers to three groups of countries: Group 1 includes 12 countries, Group 2 includes 4 countries, and Group 3 includes only one country).

[Image unavailable: regression output including Welfare3]

IIb. Is it more appropriate to use a mixed model, e.g.
[Image unavailable]
or

[Image unavailable]

Thank you,
Maria
Attached Files

LETTER TO STATA_MARIA_PETRAKH_26.8.2024.pdf (185.3 KB, 1 view)

Carlo Lazzaro

Join Date: Apr 2014

Posts: 17740
#2

26 Aug 2024, 08:41

Maria:
welcome to this forum.
Please see the FAQ on how to post (much more) effectively. Thanks.
Just in case, see also https://www.statalist.org/forums/help#adviceextras #4.
That said, I would go with your point #5, regardless of which coefficients reach statistical significance.

Kind regards,
Carlo
(Stata 19.0)
Maria Petraki

Join Date: Aug 2024

Posts: 6
#3

27 Aug 2024, 00:40

Dear Carlo,

Thank you very much for your advice on my previous questions.
I have one more question: How can I provide a comparison concerning the grouping of the 17 countries into 3 groups?
May I also present the following results?

Code:

xtreg SPHBVB TEMPEMPL PARTIME SELFEMPLnoEMPLs UNEMPLOYMENT i.WELFARE3, re vce(robust)

Thank you in advance,
Maria
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17740
#4

27 Aug 2024, 01:18

Maria:
as per the FAQ, an excerpt/example of your dataset would be welcome.
That said, if -welfare3- actually categorizes the 17 countries into 3 groups, you can go that way and see what happens.

Kind regards,
Carlo
(Stata 19.0)

Maria Petraki

Join Date: Aug 2024
Posts: 6

27 Aug 2024, 07:37

Dear Carlo,

Thank you very much for your help so far. Please note that, in the output below, the WELFARE3 variable refers to three groups of countries (Group 1 includes 12 countries, Group 2 includes 4 countries, and Group 3 includes only one country).

Code:

. xtreg SPHBVB TEMPEMPL PARTIME SELFEMPLnoEMPLs UNEMPLOYMENT i.WELFARE3, re vce(
> robust)

Random-effects GLS regression                   Number of obs     =        255
Group variable: id                              Number of groups  =         17

R-squared:                                      Obs per group:
     Within  = 0.0517                                         min =         15
     Between = 0.5942                                         avg =       15.0
     Overall = 0.4074                                         max =         15

                                                Wald chi2(6)      =      71.40
corr(u_i, X) = 0 (assumed)                      Prob > chi2       =     0.0000

                                     (Std. err. adjusted for 17 clusters in id)
-------------------------------------------------------------------------------
              |               Robust
       SPHBVB | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
--------------+----------------------------------------------------------------
     TEMPEMPL |   -.042982   .0390567    -1.10   0.271    -.1195317    .0335677
      PARTIME |   .0238397    .028361     0.84   0.401    -.0317468    .0794261
SELFEMPLnoE~s |  -.2973076   .0842092    -3.53   0.000    -.4623547   -.1322606
 UNEMPLOYMENT |    .071698    .042335     1.69   0.090    -.0112771    .1546731
              |
     WELFARE3 |
           2  |  -3.000322   1.177964    -2.55   0.011     -5.30909   -.6915548
           3  |   2.199055   2.146342     1.02   0.306    -2.007699    6.405808
              |
        _cons |    8.49148   1.649644     5.15   0.000     5.258237    11.72472
--------------+----------------------------------------------------------------
      sigma_u |  2.3262191
      sigma_e |  2.2571891
          rho |  .51505743   (fraction of variance due to u_i)
-------------------------------------------------------------------------------

Kind regards,
Maria


Maria Petraki

Join Date: Aug 2024

Posts: 6
#6

27 Aug 2024, 07:47

I hope my previous message is ok.

Maria


Carlo Lazzaro

Join Date: Apr 2014
Posts: 17740

27 Aug 2024, 08:04

Maria:
thanks for using CODE delimiters, first.
Some comments about your post follow:
1) I assume that you already ran -xttest0- after -xtreg,re-;
2) clustering standard errors with fewer than 30 panels is possibly misleading. Go with default standard errors and check the difference between the two types of standard errors;
3) you should plan to use -test- on the levels of the WELFARE3 variable;
4) you're recommended to replicate by hand the -linktest- procedure to investigate the possible misspecification of the functional form of the regressand (basically, a way to explore whether your model is correctly specified).
The following toy example uses a Stata example dataset to work out points 2)-4):

Code:

. use "https://www.stata-press.com/data/r18/nlswork.dta"
(National Longitudinal Survey of Young Women, 14-24 years old in 1968)

. xtreg ln_wage i.race c.age##c.age, re rob

Random-effects GLS regression                   Number of obs     =     28,510
Group variable: idcode                          Number of groups  =      4,710

R-squared:                                      Obs per group:
     Within  = 0.1087                                         min =          1
     Between = 0.1175                                         avg =        6.1
     Overall = 0.1048                                         max =         15

                                                Wald chi2(4)      =    1354.70
corr(u_i, X) = 0 (assumed)                      Prob > chi2       =     0.0000

                             (Std. err. adjusted for 4,710 clusters in idcode)
------------------------------------------------------------------------------
             |               Robust
     ln_wage | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
        race |
      Black  |  -.1237269   .0126612    -9.77   0.000    -.1485424   -.0989114
      Other  |   .0965773   .0613496     1.57   0.115    -.0236657    .2168203
             |
         age |   .0594573   .0041032    14.49   0.000     .0514151    .0674995
             |
 c.age#c.age |  -.0006835   .0000688    -9.94   0.000    -.0008182   -.0005487
             |
       _cons |   .5761164   .0586669     9.82   0.000     .4611314    .6911015
-------------+----------------------------------------------------------------
     sigma_u |  .36094993
     sigma_e |  .30245467
         rho |   .5874941   (fraction of variance due to u_i)
------------------------------------------------------------------------------

. mat list e(b)

e(b)[1,6]
            1b.          2.          3.                  c.age#            
          race        race        race         age       c.age       _cons
y1           0  -.12372688   .09657728   .05945731  -.00068348   .57611643

. test 1b.race=3.race

 ( 1)  1b.race - 3.race = 0

           chi2(  1) =    2.48
         Prob > chi2 =    0.1154

. predict fitted, xb
(24 missing values generated)

. g sq_fitted=fitted^2
(24 missing values generated)

. xtreg ln_wage fitted sq_fitted , re rob

Random-effects GLS regression                   Number of obs     =     28,510
Group variable: idcode                          Number of groups  =      4,710

R-squared:                                      Obs per group:
     Within  = 0.1092                                         min =          1
     Between = 0.1182                                         avg =        6.1
     Overall = 0.1056                                         max =         15

                                                Wald chi2(2)      =    1377.13
corr(u_i, X) = 0 (assumed)                      Prob > chi2       =     0.0000

                             (Std. err. adjusted for 4,710 clusters in idcode)
------------------------------------------------------------------------------
             |               Robust
     ln_wage | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
      fitted |   2.372522   .4962628     4.78   0.000     1.399865    3.345179
   sq_fitted |  -.4193389   .1527897    -2.74   0.006    -.7188013   -.1198766
       _cons |  -1.114001   .4010496    -2.78   0.005    -1.900044   -.3279582
-------------+----------------------------------------------------------------
     sigma_u |  .36252299
     sigma_e |  .30238279
         rho |  .58971525   (fraction of variance due to u_i)
------------------------------------------------------------------------------

. test sq_fitted

 ( 1)  sq_fitted = 0

           chi2(  1) =    7.53
         Prob > chi2 =    0.0061

.
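
Point 2) applied to the model discussed in this thread could be sketched as follows. This is only an illustration, assuming the panel identifier is `id` as in the output above; the stored-estimate names `re_default` and `re_cluster` are made up for the example.

```stata
* Default (conventional) standard errors
xtreg SPHBVB TEMPEMPL PARTIME SELFEMPLnoEMPLs UNEMPLOYMENT i.WELFARE3, re
estimates store re_default

* Cluster-robust standard errors (17 clusters)
xtreg SPHBVB TEMPEMPL PARTIME SELFEMPLnoEMPLs UNEMPLOYMENT i.WELFARE3, re vce(cluster id)
estimates store re_cluster

* Same coefficients in both columns; only the standard errors differ
estimates table re_default re_cluster, b(%9.4f) se(%9.4f)
```

If the two sets of standard errors lead to different conclusions, that is worth reporting rather than hiding, given the small number of panels.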

Kind regards,
Carlo
(Stata 19.0)


Maria Petraki

Join Date: Aug 2024
Posts: 6

29 Aug 2024, 08:41

Dear Carlo,

Thanks for the clear and detailed comments.
Concerning your comment 1, I had already run -xttest0- after -xtreg,re-; it showed Prob > chibar2 = 0.0000, so -re- was chosen over OLS.
Regarding the use of -test- on the levels of the WELFARE3 variable (comment 3), I am unsure whether I followed the right procedure and what the results indicate.

Code:

. xtreg SPHBVB TEMPEMPL PARTIME SELFEMPLnoEMPLs UNEMPLOYMENT i.WELFARE3, re vce(robust)

Random-effects GLS regression                   Number of obs     =        255
Group variable: id                              Number of groups  =         17

R-squared:                                      Obs per group:
     Within  = 0.0517                                         min =         15
     Between = 0.5942                                         avg =       15.0
     Overall = 0.4074                                         max =         15

                                                Wald chi2(6)      =      71.40
corr(u_i, X) = 0 (assumed)                      Prob > chi2       =     0.0000

                                       (Std. err. adjusted for 17 clusters in id)
---------------------------------------------------------------------------------
                |               Robust
         SPHBVB | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
----------------+----------------------------------------------------------------
       TEMPEMPL |   -.042982   .0390567    -1.10   0.271    -.1195317    .0335677
        PARTIME |   .0238397    .028361     0.84   0.401    -.0317468    .0794261
SELFEMPLnoEMPLs |  -.2973076   .0842092    -3.53   0.000    -.4623547   -.1322606
   UNEMPLOYMENT |    .071698    .042335     1.69   0.090    -.0112771    .1546731
                |
       WELFARE3 |
             2  |  -3.000322   1.177964    -2.55   0.011     -5.30909   -.6915548
             3  |   2.199055   2.146342     1.02   0.306    -2.007699    6.405808
                |
          _cons |    8.49148   1.649644     5.15   0.000     5.258237    11.72472
----------------+----------------------------------------------------------------
        sigma_u |  2.3262191
        sigma_e |  2.2571891
            rho |  .51505743   (fraction of variance due to u_i)
---------------------------------------------------------------------------------

. test i2.WELFARE3

 ( 1)  2.WELFARE3 = 0

           chi2(  1) =    6.49
         Prob > chi2 =    0.0109

. test i3.WELFARE3

 ( 1)  3.WELFARE3 = 0

           chi2(  1) =    1.05
         Prob > chi2 =    0.3056

I also followed your Stata example step by step, hopefully without any serious mistake, and applied the test. The final -test- outcome (Prob > chi2 = 0.3920) points towards an acceptable model (?).

Code:

. test 1b.WELFARE3=3.WELFARE3

 ( 1)  1b.WELFARE3 - 3.WELFARE3 = 0

           chi2(  1) =    1.05
         Prob > chi2 =    0.3056

. 
. 
. predict fitted, xb

. g sq_fitted=fitted^2

. xtreg SPHBVB fitted sq_fitted , re rob

Random-effects GLS regression                   Number of obs     =        255
Group variable: id                              Number of groups  =         17

R-squared:                                      Obs per group:
     Within  = 0.0478                                         min =         15
     Between = 0.6297                                         avg =       15.0
     Overall = 0.4296                                         max =         15

                                                Wald chi2(2)      =      42.38
corr(u_i, X) = 0 (assumed)                      Prob > chi2       =     0.0000

                                    (Std. err. adjusted for 17 clusters in id)
------------------------------------------------------------------------------
             |               Robust
      SPHBVB | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
      fitted |   .5476712     .55013     1.00   0.319    -.5305639    1.625906
   sq_fitted |   .0498526   .0582377     0.86   0.392    -.0642912    .1639963
       _cons |   .7105321   1.175404     0.60   0.546    -1.593217    3.014281
-------------+----------------------------------------------------------------
     sigma_u |  1.6868596
     sigma_e |  2.2484105
         rho |  .36015074   (fraction of variance due to u_i)
------------------------------------------------------------------------------

. test sq_fitted

 ( 1)  sq_fitted = 0

           chi2(  1) =    0.73
         Prob > chi2 =    0.3920

.

Two additional questions:
1. If the WELFARE3 variable, according to the -test- results, is not appropriate for the model, may I present the differences based on a pooled regression model instead? That is, may I present both the random-effects model and a pooled regression, the latter to show the differences across WELFARE3 groups?
2. May I attempt to estimate a mixed model as a better alternative?
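
For reference on question 2, a random-intercept mixed model for these data could be sketched as below. This is an assumption about what a mixed specification would look like here, not a reconstruction of the unavailable screenshots; `id` is the panel identifier from the output above, and the data are assumed to be -xtset- already.

```stata
* Random intercept for each country, fitted by maximum likelihood
mixed SPHBVB TEMPEMPL PARTIME SELFEMPLnoEMPLs UNEMPLOYMENT i.WELFARE3 || id:

* The same random-intercept model fitted through xtreg, for comparison
xtreg SPHBVB TEMPEMPL PARTIME SELFEMPLnoEMPLs UNEMPLOYMENT i.WELFARE3, mle
```

A mixed model with only a random intercept is the same model as -xtreg, re-, estimated by maximum likelihood instead of GLS, so it would differ in substance only if random slopes or further levels were added.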

Kind regards,
Maria


Carlo Lazzaro

Join Date: Apr 2014

Posts: 17740
#9

29 Aug 2024, 11:09

Maria:
1) your -test- (as expected) repeats (with one more decimal digit) the p-values already reported in the regression outcome table;
2) as per the results of the auxiliary regression, your model shows no evidence of misspecification;
3) I would stick with -xtreg,re- and forget your queries # 1) and 2).
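
As a possible complement to point 1), tests that do not simply repeat the regression table could be sketched as follows. This assumes the same model and the panel identifier `id` as in the output above.

```stata
xtreg SPHBVB TEMPEMPL PARTIME SELFEMPLnoEMPLs UNEMPLOYMENT i.WELFARE3, re vce(cluster id)

* Joint Wald test that the coefficients on all WELFARE3 levels are zero
testparm i.WELFARE3

* Group 2 versus group 3, a contrast the regression table does not report
test 2.WELFARE3 = 3.WELFARE3
```

Since group 3 contains a single country, any contrast involving it rests on one panel and is best read with caution.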

Kind regards,
Carlo
(Stata 19.0)
Maria Petraki

Join Date: Aug 2024

Posts: 6
#10

30 Aug 2024, 00:59

Dear Carlo,

Thank you very much for your advice. Your support was immensely valuable.

Kind regards,
Maria