testing mixed model assumptions

Hanna Lanzinger

Join Date: Jun 2019

Posts: 8
#1

27 Jun 2019, 01:32

Hi everyone,

I'm using mixed models for the first time and I'm having trouble checking the model assumptions in Stata.
I've tried to check them the same way I would for linear regression, but the commands (estat hettest, estat vif, ...) do not work.

Are there any special commands for mixed models to test for multicollinearity, heteroskedasticity, etc.?

Thanks in advance for your help!

Best,
Hanna

Last edited by Hanna Lanzinger; 27 Jun 2019, 02:28.

Carlo Lazzaro

Join Date: Apr 2014
Posts: 17712

27 Jun 2019, 02:08

Hanna:
welcome to this forum.
The first step is taking a look at -help mixed postestimation-.
-mixed- is in the same situation as -xtreg-: neither supports most of the -regress postestimation- commands (by the way, the commands in your parentheses, as typed, are not legal even after -regress-! As you may already know, Stata is very demanding when it comes to spelling out commands).
That said, you can get an idea of a possible near-extreme multicollinearity issue via -estat vce, corr-.
Heteroskedasticity can be assessed via visual inspection.
The so-called omitted-variable bias test (which in fact detects nonlinearity between regressand and predictors) can be approximated with something similar to -linktest- (see -help linktest-):

Code:

. use http://www.stata-press.com/data/r15/pig.dta
(Longitudinal analysis of pig weights)

. mixed weight week || id:

Performing EM optimization:

Performing gradient-based optimization:

Iteration 0:   log likelihood = -1014.9268 
Iteration 1:   log likelihood = -1014.9268 

Computing standard errors:

Mixed-effects ML regression                     Number of obs     =        432
Group variable: id                              Number of groups  =         48

                                                Obs per group:
                                                              min =          9
                                                              avg =        9.0
                                                              max =          9

                                                Wald chi2(1)      =   25337.49
Log likelihood = -1014.9268                     Prob > chi2       =     0.0000

------------------------------------------------------------------------------
      weight |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        week |   6.209896   .0390124   159.18   0.000     6.133433    6.286359
       _cons |   19.35561   .5974059    32.40   0.000     18.18472    20.52651
------------------------------------------------------------------------------

------------------------------------------------------------------------------
  Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
id: Identity                 |
                  var(_cons) |   14.81751   3.124226      9.801716    22.40002
-----------------------------+------------------------------------------------
               var(Residual) |   4.383264   .3163348      3.805112     5.04926
------------------------------------------------------------------------------
LR test vs. linear model: chibar2(01) = 472.65        Prob >= chibar2 = 0.0000

. predict fitted_m, xb

. g sq_fitted_m= fitted_m^2

. mixed weight fitted_m sq_fitted_m || id:

Performing EM optimization:

Performing gradient-based optimization:

Iteration 0:   log likelihood = -1014.5524 
Iteration 1:   log likelihood = -1014.5524 

Computing standard errors:

Mixed-effects ML regression                     Number of obs     =        432
Group variable: id                              Number of groups  =         48

                                                Obs per group:
                                                              min =          9
                                                              avg =        9.0
                                                              max =          9

                                                Wald chi2(2)      =   25387.69
Log likelihood = -1014.5524                     Prob > chi2       =     0.0000

------------------------------------------------------------------------------
      weight |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    fitted_m |   1.038931   .0454051    22.88   0.000     .9499385    1.127923
 sq_fitted_m |  -.0003862   .0004461    -0.87   0.387    -.0012605    .0004881
       _cons |  -.8818749   1.206892    -0.73   0.465    -3.247339    1.483589
------------------------------------------------------------------------------

------------------------------------------------------------------------------
  Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
id: Identity                 |
                  var(_cons) |   14.81846   3.124225      9.802604    22.40086
-----------------------------+------------------------------------------------
               var(Residual) |   4.374725   .3157186      3.797699    5.039424
------------------------------------------------------------------------------
LR test vs. linear model: chibar2(01) = 473.23        Prob >= chibar2 = 0.0000

. test sq_fitted_m=0

 ( 1)  [weight]sq_fitted_m = 0

           chi2(  1) =    0.75
         Prob > chi2 =    0.3866

.
*As the -test- outcome does not reach statistical significance, there's no evidence that the model is misspecified*
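
As a rough sketch of the -estat vce, corr- and visual inspection suggestions above (an illustration only, assuming the same pig dataset; the variable names fit_m and rstd_m are made up for this example), something like the following could be run after -mixed-:

```stata
* Sketch only: refit the same model on the pig data used in this post
use http://www.stata-press.com/data/r15/pig.dta, clear
mixed weight week || id:

* Correlation matrix of the estimates; values near +/-1 hint at collinearity
estat vce, corr

* Fitted values (fixed portion plus predicted random effects)
* fit_m and rstd_m are hypothetical variable names
predict double fit_m, fitted

* Standardized residuals
predict double rstd_m, rstandard

* Fan or funnel shapes in this plot would suggest heteroskedasticity
scatter rstd_m fit_m, yline(0)

* Normality of the residuals can be eyeballed in the same way
qnorm rstd_m
```

These plots are judgment calls rather than formal tests, so they are best read alongside the -linktest--style check shown above.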

Kind regards,
Carlo
(Stata 19.0)

Hanna Lanzinger

Join Date: Jun 2019

Posts: 8
#3

27 Jun 2019, 23:45

Hello Carlo,

thank you very much for your fast response and your help!!

Now I'm having another problem: computing a pseudo R-squared for my models. I found some code in other posts, but I'm not sure whether I'm using it correctly:
(I'm performing a cross-classified model with country and sectors (variable: sic_2) as clusters)

Code:

*Regular model
mixed management pdi_100 i.ownership firm_size_1000 firm_size_sq_1000000 || _all: R.sic_2 || _all: R.country,
scalar llu = e(ll)

*Constant model only
mixed management pdi_100 || _all: R.sic_2 || _all: R.country,
scalar llr = e(ll)

*computation of the pseudo r2
scalar pr2 = 1- llu/llr
di "Pseudo-r2:" pr2

Another question that came to mind: is it possible to combine mixed models with clustered standard errors?

Thanks again in advance for your help!

Best,
Hanna
Hanna Lanzinger

Join Date: Jun 2019

Posts: 8
#4

28 Jun 2019, 02:15

I figured out that, to get robust variances clustered at the highest level of a multilevel model, I have to specify it as follows:

Code:

*Model 1: Only clustering at country level taken into account
mixed management i.ownership firm_size_1000 firm_size_sq_1000000 || country:, vce(robust)

My second model is cross-classified, with countries and sectors (variable: sic_2) as clusters. If I try the same approach as in my first model, the following error occurs:

Code:

*Model 2: Without PDI, clustering at country and at sector level
mixed management i.ownership firm_size_1000 firm_size_sq_1000000 || _all: R.sic_2 || _all: r.country, vce(robust)
robust variances and pweights not supported when highest-level group encompasses all the data
r(498);

Is it simply not necessary to use clustered standard errors in a cross-classified model?

Best,
Hanna