  • Fixed, Random or Mixed model?

    Hi all,


    I am new to panel data regression analysis and Stata, so forgive me if my questions are too elementary. I have some questions about the use of random or fixed effect models and using the correct estimators.

    Data characteristics:
    - Panel data
    - Balanced
    - T = 15 years (2009-2023) and N = 17 European countries (T < N)
    - Dependent variable is Self-rated bad or very bad health (SPHBVB)
    - Independent variables are: temporary employment (TEMPEMPL), part-time employment (PARTIME), self-employment without employees (SELFEMPLnoEMPLs), and unemployment (UNEMPLOYMENT). All variables are in %.

    Aim of analysis:
    - To perform a regression analysis that is efficient and consistent under robustness tests

    Method:
    Perform OLS / FE and RE models or Mixed

    1. First I checked the correlation with all variables

    [Image not available: correlation matrix of all variables]

    Comment: The dependent variable is most strongly correlated with SELFEMPLnoEMPLs and PARTIME. Among the independent variables, the highest correlation is between SELFEMPLnoEMPLs and PARTIME.

    2. Then I did fixed and random regressions

    In both cases only the variable SELFEMPLnoEMPLs was statistically significant.

    3. Then I ran the Breusch-Pagan LM test for random effects (xttest0), which gave Prob > chibar2 = 0.0000, so RE is preferred over pooled OLS.

    [Image not available: xttest0 output]


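    4. Tests: I checked for first-order autocorrelation using xtserial and ran xttest2, which I read as a heteroscedasticity check (although, as its output below states, xttest2 is the Breusch-Pagan LM test of cross-sectional independence). Both tests reject their null hypotheses.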

    a. xtserial SPHBVB TEMPEMPL PARTIME SELFEMPLnoEMPLs UNEMPLOYMENT

    Wooldridge test for autocorrelation in panel data
    H0: no first order autocorrelation
    F( 1, 16) = 17.066
    Prob > F = 0.0008

    b. xttest2
    Breusch-Pagan LM test of independence: chi2(136) = 216.828, Pr = 0.0000
    Based on 15 complete observations over panel units


    c. Also VIF and 1/VIF were satisfactory
    [Image not available: VIF table]

    5. After reading other questions on this site, I found that you can deal with autocorrelation and heteroscedasticity by clustering the standard errors. So I ran the random-effects regression with the cluster option. Only the variable SELFEMPLnoEMPLs is significant.

    [Image not available: random-effects output with clustered standard errors]

    xtoverid

    Test of overidentifying restrictions: fixed vs random effects
    Cross-section time-series model: xtreg re robust cluster(id)
    Sargan-Hansen statistic 3.062 Chi-sq(4) P-value = 0.5475

    6. I then dropped variables and, after several attempts, found that only PARTIME and SELFEMPLnoEMPLs are significant, although R-squared decreased after omitting TEMPEMPL and then UNEMPLOYMENT.
    [Image not available: random-effects output for the reduced model]


    xtoverid
    Test of overidentifying restrictions: fixed vs random effects
    Cross-section time-series model: xtreg re robust cluster(CODE)
    Sargan-Hansen statistic 0.406 Chi-sq(2) P-value = 0.8161

    So the -xtoverid- output indicates that the -re- model is the way to go.
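Steps 2, 3 and 5 above can be sketched as follows. This is only an illustration on the nlswork example dataset, not the poster's data: ln_wage, age, tenure, idcode and year belong to that dataset and merely stand in for SPHBVB, the employment shares, the country identifier and the year. The user-written commands -xtserial-, -xttest2- and -xtoverid- are left out because they require separate installation.

```stata
* Illustrative data only: nlswork stands in for the 17-country panel
use "https://www.stata-press.com/data/r18/nlswork.dta", clear
xtset idcode year

* Step 2: fixed- and random-effects fits of the same specification
xtreg ln_wage age tenure, fe
xtreg ln_wage age tenure, re

* Step 3: Breusch-Pagan LM test for random effects (RE vs pooled OLS)
xttest0

* Step 5: same RE model, standard errors clustered on the panel identifier
xtreg ln_wage age tenure, re vce(cluster idcode)
```

Clustering changes only the standard errors; the coefficient estimates are the same as in the unclustered -re- fit.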

    Questions

    1. Should I accept the findings in (6) as the most appropriate and consistent with the theoretical models, or should I discuss and accept those in (5)?

    To finish, two last questions, very crucial for my research.
    In many similar studies, I found that social scientists use a fixed-effects model ad hoc, or at least present it for comparison, even when the random-effects model turns out to be the appropriate one.

    IIa. I would appreciate some comments on the finding below (the variable Welfare3 refers to three groups of countries: Group 1 includes 12 countries, Group 2 includes 4 countries, and Group 3 includes only one country).

    [Image not available: regression output including Welfare3]

    IIb. Is it more appropriate to use a mixed model, e.g.
    [Image not available: first mixed-model output]
    Or


    [Image not available: second mixed-model output]


    Thank you,
    Maria
    Last edited by Maria Petraki; 26 Aug 2024, 07:53.

  • #2
    Maria:
    welcome to this forum.
    Please see the FAQ on how to post (much more) effectively. Thanks.
    Just in case, see also https://www.statalist.org/forums/help#adviceextras #4.
    That said, I would go as per your point #5, not caring for what reaches statistical significance.
    Kind regards,
    Carlo
    (Stata 19.0)



    • #3
      Dear Carlo,

      Thank you very much for your advice on my previous questions.
      I have one more question: How can I provide a comparison concerning the grouping of the 17 countries into 3 groups?
      May I also present the following results?

      Code:
      xtreg SPHBVB TEMPEMPL PARTIME SELFEMPLnoEMPLs UNEMPLOYMENT i.WELFARE3, re vce(robust)
      Thank you in advance,
      Maria



      • #4
        Maria:
        as per FAQ, an excerpt/example of your dataset would be welcomed.
        That said, if -welfare3- actually categorizes the 17 countries into 3 groups, you can go that way and see what happens.
        Kind regards,
        Carlo
        (Stata 19.0)



        • #5
          Dear Carlo,

          Thank you very much for your help so far. Please note that below, the Welfare 3 variable refers to three groups of countries (Group 1 includes 12 countries, Group 2 includes 4 countries, and Group 3 includes only one country).

          Code:
          . xtreg SPHBVB TEMPEMPL PARTIME SELFEMPLnoEMPLs UNEMPLOYMENT i.WELFARE3, re vce(
          > robust)
          
          Random-effects GLS regression                   Number of obs     =        255
          Group variable: id                              Number of groups  =         17
          
          R-squared:                                      Obs per group:
               Within  = 0.0517                                         min =         15
               Between = 0.5942                                         avg =       15.0
               Overall = 0.4074                                         max =         15
          
                                                          Wald chi2(6)      =      71.40
          corr(u_i, X) = 0 (assumed)                      Prob > chi2       =     0.0000
          
                                               (Std. err. adjusted for 17 clusters in id)
          -------------------------------------------------------------------------------
                        |               Robust
                 SPHBVB | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
          --------------+----------------------------------------------------------------
               TEMPEMPL |   -.042982   .0390567    -1.10   0.271    -.1195317    .0335677
                PARTIME |   .0238397    .028361     0.84   0.401    -.0317468    .0794261
          SELFEMPLnoE~s |  -.2973076   .0842092    -3.53   0.000    -.4623547   -.1322606
           UNEMPLOYMENT |    .071698    .042335     1.69   0.090    -.0112771    .1546731
                        |
               WELFARE3 |
                     2  |  -3.000322   1.177964    -2.55   0.011     -5.30909   -.6915548
                     3  |   2.199055   2.146342     1.02   0.306    -2.007699    6.405808
                        |
                  _cons |    8.49148   1.649644     5.15   0.000     5.258237    11.72472
          --------------+----------------------------------------------------------------
                sigma_u |  2.3262191
                sigma_e |  2.2571891
                    rho |  .51505743   (fraction of variance due to u_i)
          -------------------------------------------------------------------------------
          Kind regards,
          Maria
          Last edited by Maria Petraki; 27 Aug 2024, 07:49.



          • #6
            I hope my previous message is ok.

            Maria
            Last edited by Maria Petraki; 27 Aug 2024, 07:51.



            • #7
              Maria:
              thanks for using CODE delimiters, first.
              Some comments about your post follow:
              1) I assume that you already ran -xttest0- after -xtreg,re-;
              2) clustering standard errors with fewer than 30 panels is possibly misleading. Go with default standard errors and check the difference between the two types of standard errors;
              3) you should plan to use -test- on WELFARE3 variable levels;
              4) I recommend replicating the -linktest- procedure by hand to investigate the possible misspecification of the functional form of the regressand (basically, a way to explore whether your model is correctly specified).
              The following toy example uses a Stata example dataset to work out points 2)-4):
              Code:
              . use "https://www.stata-press.com/data/r18/nlswork.dta"
              (National Longitudinal Survey of Young Women, 14-24 years old in 1968)
              
              . xtreg ln_wage i.race c.age##c.age, re rob
              
              Random-effects GLS regression                   Number of obs     =     28,510
              Group variable: idcode                          Number of groups  =      4,710
              
              R-squared:                                      Obs per group:
                   Within  = 0.1087                                         min =          1
                   Between = 0.1175                                         avg =        6.1
                   Overall = 0.1048                                         max =         15
              
                                                              Wald chi2(4)      =    1354.70
              corr(u_i, X) = 0 (assumed)                      Prob > chi2       =     0.0000
              
                                           (Std. err. adjusted for 4,710 clusters in idcode)
              ------------------------------------------------------------------------------
                           |               Robust
                   ln_wage | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
              -------------+----------------------------------------------------------------
                      race |
                    Black  |  -.1237269   .0126612    -9.77   0.000    -.1485424   -.0989114
                    Other  |   .0965773   .0613496     1.57   0.115    -.0236657    .2168203
                           |
                       age |   .0594573   .0041032    14.49   0.000     .0514151    .0674995
                           |
               c.age#c.age |  -.0006835   .0000688    -9.94   0.000    -.0008182   -.0005487
                           |
                     _cons |   .5761164   .0586669     9.82   0.000     .4611314    .6911015
              -------------+----------------------------------------------------------------
                   sigma_u |  .36094993
                   sigma_e |  .30245467
                       rho |   .5874941   (fraction of variance due to u_i)
              ------------------------------------------------------------------------------
              
              . mat list e(b)
              
              e(b)[1,6]
                          1b.          2.          3.                  c.age#            
                        race        race        race         age       c.age       _cons
              y1           0  -.12372688   .09657728   .05945731  -.00068348   .57611643
              
              . test 1b.race=3.race
              
               ( 1)  1b.race - 3.race = 0
              
                         chi2(  1) =    2.48
                       Prob > chi2 =    0.1154
              
              . predict fitted, xb
              (24 missing values generated)
              
              . g sq_fitted=fitted^2
              (24 missing values generated)
              
              . xtreg ln_wage fitted sq_fitted , re rob
              
              Random-effects GLS regression                   Number of obs     =     28,510
              Group variable: idcode                          Number of groups  =      4,710
              
              R-squared:                                      Obs per group:
                   Within  = 0.1092                                         min =          1
                   Between = 0.1182                                         avg =        6.1
                   Overall = 0.1056                                         max =         15
              
                                                              Wald chi2(2)      =    1377.13
              corr(u_i, X) = 0 (assumed)                      Prob > chi2       =     0.0000
              
                                           (Std. err. adjusted for 4,710 clusters in idcode)
              ------------------------------------------------------------------------------
                           |               Robust
                   ln_wage | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
              -------------+----------------------------------------------------------------
                    fitted |   2.372522   .4962628     4.78   0.000     1.399865    3.345179
                 sq_fitted |  -.4193389   .1527897    -2.74   0.006    -.7188013   -.1198766
                     _cons |  -1.114001   .4010496    -2.78   0.005    -1.900044   -.3279582
              -------------+----------------------------------------------------------------
                   sigma_u |  .36252299
                   sigma_e |  .30238279
                       rho |  .58971525   (fraction of variance due to u_i)
              ------------------------------------------------------------------------------
              
              . test sq_fitted
              
               ( 1)  sq_fitted = 0
              
                         chi2(  1) =    7.53
                       Prob > chi2 =    0.0061
              
              .
              Kind regards,
              Carlo
              (Stata 19.0)
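Point 2) above asks for a comparison of default and clustered standard errors, which the log does not show. A minimal sketch of that comparison on the same toy model follows; the names re_default and re_cluster are arbitrary labels chosen here for the stored results.

```stata
* Same toy model as in the log above, fitted twice: only the SEs differ
use "https://www.stata-press.com/data/r18/nlswork.dta", clear
xtset idcode year

* Default (conventional GLS) standard errors
xtreg ln_wage i.race c.age##c.age, re
estimates store re_default

* Standard errors clustered on the panel identifier
xtreg ln_wage i.race c.age##c.age, re vce(cluster idcode)
estimates store re_cluster

* Side-by-side table: identical coefficients, different standard errors
estimates table re_default re_cluster, b(%9.4f) se(%9.4f)
```

With the poster's data the same pattern would apply to the -xtreg SPHBVB ...- specification, clustering on id. With only 17 clusters, a large gap between the two columns of standard errors is a reason for caution rather than evidence in favour of either set.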



              • #8
                Dear Carlo,

                Thanks for the clear and detailed comments.
                Concerning your comment 1), I had already run -xttest0- after -xtreg,re-. It showed Prob > chibar2 = 0.0000, so RE was chosen over pooled OLS.
                Regarding the use of “test” on WELFARE3 variable levels (comment 3), I am unsure if I did the right procedure and what the results indicate.
                Code:
                . xtreg SPHBVB TEMPEMPL PARTIME SELFEMPLnoEMPLs UNEMPLOYMENT i.WELFARE3, re vce(robust)
                
                Random-effects GLS regression                   Number of obs     =        255
                Group variable: id                              Number of groups  =         17
                
                R-squared:                                      Obs per group:
                     Within  = 0.0517                                         min =         15
                     Between = 0.5942                                         avg =       15.0
                     Overall = 0.4074                                         max =         15
                
                                                                Wald chi2(6)      =      71.40
                corr(u_i, X) = 0 (assumed)                      Prob > chi2       =     0.0000
                
                                                       (Std. err. adjusted for 17 clusters in id)
                ---------------------------------------------------------------------------------
                                |               Robust
                         SPHBVB | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
                ----------------+----------------------------------------------------------------
                       TEMPEMPL |   -.042982   .0390567    -1.10   0.271    -.1195317    .0335677
                        PARTIME |   .0238397    .028361     0.84   0.401    -.0317468    .0794261
                SELFEMPLnoEMPLs |  -.2973076   .0842092    -3.53   0.000    -.4623547   -.1322606
                   UNEMPLOYMENT |    .071698    .042335     1.69   0.090    -.0112771    .1546731
                                |
                       WELFARE3 |
                             2  |  -3.000322   1.177964    -2.55   0.011     -5.30909   -.6915548
                             3  |   2.199055   2.146342     1.02   0.306    -2.007699    6.405808
                                |
                          _cons |    8.49148   1.649644     5.15   0.000     5.258237    11.72472
                ----------------+----------------------------------------------------------------
                        sigma_u |  2.3262191
                        sigma_e |  2.2571891
                            rho |  .51505743   (fraction of variance due to u_i)
                ---------------------------------------------------------------------------------
                
                . test i2.WELFARE3
                
                 ( 1)  2.WELFARE3 = 0
                
                           chi2(  1) =    6.49
                         Prob > chi2 =    0.0109
                
                . test i3.WELFARE3
                
                 ( 1)  3.WELFARE3 = 0
                
                           chi2(  1) =    1.05
                         Prob > chi2 =    0.3056
                Also, I followed your Stata example step by step (hopefully without any serious mistake) and applied the test. The final -test- result (Prob > chi2 = 0.3920) points towards an acceptable model (?).

                Code:
                . test 1b.WELFARE3=3.WELFARE3
                
                 ( 1)  1b.WELFARE3 - 3.WELFARE3 = 0
                
                           chi2(  1) =    1.05
                         Prob > chi2 =    0.3056
                
                . 
                . 
                . predict fitted, xb
                
                . g sq_fitted=fitted^2
                
                . xtreg SPHBVB fitted sq_fitted , re rob
                
                Random-effects GLS regression                   Number of obs     =        255
                Group variable: id                              Number of groups  =         17
                
                R-squared:                                      Obs per group:
                     Within  = 0.0478                                         min =         15
                     Between = 0.6297                                         avg =       15.0
                     Overall = 0.4296                                         max =         15
                
                                                                Wald chi2(2)      =      42.38
                corr(u_i, X) = 0 (assumed)                      Prob > chi2       =     0.0000
                
                                                    (Std. err. adjusted for 17 clusters in id)
                ------------------------------------------------------------------------------
                             |               Robust
                      SPHBVB | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
                -------------+----------------------------------------------------------------
                      fitted |   .5476712     .55013     1.00   0.319    -.5305639    1.625906
                   sq_fitted |   .0498526   .0582377     0.86   0.392    -.0642912    .1639963
                       _cons |   .7105321   1.175404     0.60   0.546    -1.593217    3.014281
                -------------+----------------------------------------------------------------
                     sigma_u |  1.6868596
                     sigma_e |  2.2484105
                         rho |  .36015074   (fraction of variance due to u_i)
                ------------------------------------------------------------------------------
                
                . test sq_fitted
                
                 ( 1)  sq_fitted = 0
                
                           chi2(  1) =    0.73
                         Prob > chi2 =    0.3920
                
                .
                Two additional questions:
                1. If the WELFARE3 variable, according to the “test” results, is not appropriate for the model, may I then present the differences based on a pooled regression model? Thus, presenting both the random effect and in addition pooled regression for the purpose of presenting differences according to WELFARE3?
                2. May I attempt to estimate a mixed model as a better alternative?

                Kind regards,
                Maria
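On the -test- question in this post: testing a single non-base level against zero, or against the base level (whose coefficient is fixed at zero), only reproduces the z test already in the coefficient table. A joint test of all levels and a direct comparison of the two non-base levels give additional information. A minimal sketch, using race in the nlswork example data as a stand-in for WELFARE3 (an assumption made purely for illustration):

```stata
use "https://www.stata-press.com/data/r18/nlswork.dta", clear
xtset idcode year
xtreg ln_wage i.race c.age##c.age, re vce(cluster idcode)

* Joint Wald test that all non-base race coefficients are zero
testparm i.race

* Compare the two non-base levels with each other
test 2.race = 3.race

* Same comparison expressed as an estimated difference with a CI
lincom 3.race - 2.race
```

With the model in this post, the analogous commands would be -testparm i.WELFARE3- and -test 2.WELFARE3 = 3.WELFARE3-. Since Group 3 contains a single country, any contrast involving that group rests on one panel only.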



                • #9
                  Maria:
                  1) your -test- (as expected) repeats (with one more decimal digit) the p-values already reported in the regression outcome table;
                  2) as per the results of the auxiliary regression, your model shows no evidence of misspecification;
                  3) I would stick with -xtreg,re- and forget your queries # 1) and 2).
                  Kind regards,
                  Carlo
                  (Stata 19.0)
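Point 1) can be checked by hand: a Wald chi2 test with one degree of freedom is the square of the corresponding z statistic. The sketch below copies the WELFARE3 coefficients and robust standard errors from the table in #8; the scalar names z2 and z3 are arbitrary.

```stata
* Coefficients and robust standard errors copied from the table in #8
scalar z2 = -3.000322 / 1.177964
scalar z3 =  2.199055 / 2.146342

* Squared z statistics are the chi2(1) values reported by -test-
display "chi2, level 2: " %6.2f z2^2 "   p = " %6.4f chi2tail(1, z2^2)
display "chi2, level 3: " %6.2f z3^2 "   p = " %6.4f chi2tail(1, z3^2)
```

Up to rounding, this should reproduce the chi2 values of 6.49 and 1.05 and the p-values of 0.0109 and 0.3056 shown in the -test- output of #8.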



                  • #10
                    Dear Carlo,

                    Thank you very much for your advice. Your support was immensely valuable.

                    Kind regards,
                    Maria
