How to assess heteroskedasticity with longitudinal data (linear mixed effects model)

Wes Beliski

Join Date: Apr 2023

Posts: 7
#1

How to assess heteroskedasticity with longitudinal data (linear mixed effects model)

02 May 2023, 07:12

Hello,

I am currently analyzing longitudinal data. I have a continuous outcome variable that was measured at 3 different time points. Furthermore, I have both categorical and continuous predictor variables which are all fixed variables.

Now I want to assess heteroskedasticity by making a residuals vs. fitted plot, but I can't figure out how to do this in Stata.

Can anyone help me with this?

Thanks in advance!
Tags: None

Carlo Lazzaro

Join Date: Apr 2014
Posts: 17714

02 May 2023, 07:45

Wes:
I do hope that what follows can help:

Code:

. use https://www.stata-press.com/data/r17/pig.dta
(Longitudinal analysis of pig weights)

. mixed weight i.week ||id:

Performing EM optimization ...

Performing gradient-based optimization: 
Iteration 0:   log likelihood = -1007.0675  
Iteration 1:   log likelihood = -1007.0675  

Computing standard errors ...

Mixed-effects ML regression                     Number of obs     =        432
Group variable: id                              Number of groups  =         48
                                                Obs per group:
                                                              min =          9
                                                              avg =        9.0
                                                              max =          9
                                                Wald chi2(8)      =   26412.22
Log likelihood = -1007.0675                     Prob > chi2       =     0.0000

------------------------------------------------------------------------------
      weight | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
        week |
          2  |   6.760417   .4187014    16.15   0.000     5.939777    7.581056
          3  |   13.84375   .4187014    33.06   0.000     13.02311    14.66439
          4  |     19.375   .4187014    46.27   0.000     18.55436    20.19564
          5  |   25.13542   .4187014    60.03   0.000     24.31478    25.95606
          6  |   31.42708   .4187014    75.06   0.000     30.60644    32.24772
          7  |    37.4375   .4187014    89.41   0.000     36.61686    38.25814
          8  |   44.28125   .4187014   105.76   0.000     43.46061    45.10189
          9  |   50.19792   .4187014   119.89   0.000     49.37728    51.01856
             |
       _cons |   25.02083   .6298893    39.72   0.000     23.78627    26.25539
------------------------------------------------------------------------------

------------------------------------------------------------------------------
  Random-effects parameters  |   Estimate   Std. err.     [95% conf. interval]
-----------------------------+------------------------------------------------
id: Identity                 |
                  var(_cons) |   14.83704    3.12421      9.819998     22.4173
-----------------------------+------------------------------------------------
               var(Residual) |   4.207462   .3036474      3.652498    4.846747
------------------------------------------------------------------------------
LR test vs. linear model: chibar2(01) = 484.84        Prob >= chibar2 = 0.0000

. predict fitted, fitted

. predict response_minus_fitted_values, residuals

. twoway (scatter response_minus_fitted_values week)

.

Kind regards,
Carlo
(Stata 19.0)

Comment

Rich Goldstein

Join Date: Mar 2014

Posts: 4466
#3

02 May 2023, 07:46

it is not clear what model you are using for estimation but some models provide a post-estimation command -rvfplot-; see

Code:

h rvfplot

but also see

Code:

h command postestimation

where you replace command with the actual command name you are using

crossed the #2

Last edited by Rich Goldstein; 02 May 2023, 07:48.
2 likes
Comment
Wes Beliski

Join Date: Apr 2023

Posts: 7
#4

02 May 2023, 10:06

Dear Carlo and Rich,

Thank you both for your help.

In response to Carlo: if I use your code, the last variable between the brackets is put on the x-axis. To illustrate that: I have 3 independent variables, all categorical: variable A, B and C. If I follow your code and end it like this: "twoway (scatter response_minus_fitted_values A B C)", then this is my output:

I am not sure if this is the output that I want, because, for example, variable A has 4 categories.

I did find the following code on this forum:

mixed [dependent variable] [indep variable A] [indep variable B] [indep variable C] || id:
predict double xb, xb
predict double re, residuals
graph twoway scatter re xb, mcolor(black) msize(small) ylabel( , angle (horizontal) nogrid)

This code does produce something that looks more like a scatterplot:

However, I am not advanced enough to fully understand the code. Do you know whether this is also a valid command?

Thanks in advance!
Comment
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17714
#5

02 May 2023, 10:10

Wes,
yes it is.
My toy-example was overly trivial (one predictor only).

Last edited by Carlo Lazzaro; 02 May 2023, 10:45.

Kind regards,
Carlo
(Stata 19.0)
1 like
Comment
Wes Beliski

Join Date: Apr 2023

Posts: 7
#6

02 May 2023, 10:11

Thank you so much, Carlo!
Comment
Wes Beliski

Join Date: Apr 2023

Posts: 7
#7

05 May 2023, 04:37

Dear Carlo (or anyone else who can help),

I have a question that builds on the previous questions that I posted here.

When using the code I previously posted, the x-axis shows the "linear prediction, fixed portion":

mixed [dependent variable] [indep variable A] [indep variable B] [indep variable C] || id:
predict double xb, xb
predict double re, residuals
graph twoway scatter re xb, mcolor(black) msize(small) ylabel( , angle (horizontal) nogrid)

However, when I look in a Stata manual, it says that the xb command calculates the linear prediction xβ based only on the estimated fixed effects (coefficients) in the model, and that it does not take into account contributions based on predicted random effects. However, since I am working with longitudinal data, and I am correcting for id, shouldn't my x-axis show the fitted values instead? Or is the previous command that I used adequate enough?

Thanks in advance for helping me understand.
Comment

Announcement

How to assess heteroskedasticity with longitudinal data (linear mixed effects model)

Comment

Comment

Comment

Comment

Comment

Comment