Testing Assumptions for Multilevel Linear Regression

Jennifer Hauschildt


08 Aug 2024, 03:22

Hello,
I am using StataMP 18.0 on Windows 10 and want to test the assumptions of my multilevel linear regression.

I have used the mixed command as follows:

Code:

mixed y x1 x2 [pw = weight] || L2:, mle

I have found the following assumptions for MLM:
The model is correctly specified (i.e., all the predictors associated with the outcome and relevant random effects are included)
I would check this with a self-made link test, as described here: https://www.statalist.org/forums/for...el-assumptions

The functional form is correct (e.g., the relationship between the predictors and outcome is linear if using a linear model);
This will be assessed by theory

Level-1 residuals are independent and normally distributed;
I know how to assess the normality of residuals with a Q-Q plot (using predict, res and qnorm/pnorm), but I do not understand how to extract the level-1 residuals only.

Level-2 residuals are independent and multivariate normally distributed;
I would do this the following way:

Code:

predict l2res, res relevel(L2)
qnorm l2res
pnorm l2res

Is that correct?

Residuals at level-1 and level-2 are unrelated;
I assume I could test this by correlating the L1 and L2 residuals, but for that I would again need to know how to extract the level-1 residuals on their own.

Predictors at one level are not related to errors at another level (homoscedasticity).
I have no idea how I should test this. I found the article by Antonakis et al. (2021), "On ignoring the random effects assumption in multilevel models: Review, critique, and recommendations", but I have trouble fully understanding it and whether it actually tests what I want to test.

Any help with any of the assumptions would be very much appreciated!
Clyde Schechter


08 Aug 2024, 10:04

To calculate the level-1 residuals:

Code:

// DO THIS AFTER YOU HAVE CALCULATED L2RES
predict xb, xb
gen l1res = y - xb - l2res

Erik Ruzek


09 Aug 2024, 14:57

I wonder if this is something that is different in newer versions of Stata (I have v16), but based on playing around with the predicted residuals, I see no difference between the following methods of obtaining residuals:

Code:

webuse pig, clear 
mixed weight week || id: , cov(un)
predict l2res, residuals
predict l2res2, residuals relevel(id)
* Fixed + random effect prediction
predict fitted, fitted 
*Observed - fitted
gen obs_fit = weight - fitted
list id weight l2res l2res2 obs_fit in 1/12, noobs sep(12)

  +-------------------------------------------------+
  | id   weight       l2res      l2res2     obs_fit |
  |-------------------------------------------------|
  |  1       24    .1175955    .1175955    .1175957 |
  |  1       32      1.9077      1.9077      1.9077 |
  |  1       39    2.697804    2.697804    2.697803 |
  |  1     42.5    -.012092    -.012092   -.0120926 |
  |  1       48   -.7219878   -.7219878   -.7219887 |
  |  1     54.5   -.4318836   -.4318836   -.4318848 |
  |  1       61   -.1417795   -.1417795   -.1417809 |
  |  1       65   -2.351675   -2.351675   -2.351677 |
  |  1       72   -1.561571   -1.561571   -1.561569 |
  |  2     22.5   -3.964211   -3.964211   -3.964211 |
  |  2     30.5   -2.174107   -2.174107   -2.174107 |
  |  2     40.5    1.615997    1.615997    1.615997 |
  +-------------------------------------------------+

I interpret this as evidence that predict, residuals is ignoring the relevel() argument and is providing only the level-1 residuals.
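The comparison can also be done numerically rather than by eye. This is only a minimal sketch: the variable names (r_plain, r_relevel, fit, obs_minus_fit, gap1, gap2) are made up for illustration, and everything is stored as double so that float rounding does not blur the comparison. One possible reading, which I have not confirmed, is that relevel() sets how far down the grouping levels the random effects are included in the fitted part, with the lowest level as the default. If that is right, relevel(id) would have nothing to change in a model with a single grouping level.

```stata
webuse pig, clear
mixed weight week || id: , cov(un)

* Residuals with and without relevel(), stored in double precision
predict double r_plain, residuals
predict double r_relevel, residuals relevel(id)

* Observed minus (fixed + random) fitted values
predict double fit, fitted
gen double obs_minus_fit = weight - fit

* Largest absolute gap between each pair of versions
gen double gap1 = abs(r_plain - r_relevel)
gen double gap2 = abs(r_plain - obs_minus_fit)
summarize gap1 gap2
```

If the maxima reported by summarize are essentially zero, the three variables are the same quantity up to rounding.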

Then, taking Clyde Schechter's code, we get the following added on:

Code:

predict xb, xb
gen l1res = weight - xb - l2res
* EB prediction for comparison
predict reffect*, reffects
list id weight l2res l1res reffect1 in 1/12, noobs sep(12)

  +-------------------------------------------------+
  | id   weight       l2res       l1res    reffect1 |
  |-------------------------------------------------|
  |  1       24    .1175955   -1.683105   -1.683105 |
  |  1       32      1.9077   -1.683106   -1.683105 |
  |  1       39    2.697804   -1.683106   -1.683105 |
  |  1     42.5    -.012092   -1.683106   -1.683105 |
  |  1       48   -.7219878   -1.683106   -1.683105 |
  |  1     54.5   -.4318836   -1.683107   -1.683105 |
  |  1       61   -.1417795   -1.683103   -1.683105 |
  |  1       65   -2.351675   -1.683107   -1.683105 |
  |  1       72   -1.561571   -1.683104   -1.683105 |
  |  2     22.5   -3.964211    .8987012    .8987018 |
  |  2     30.5   -2.174107     .898701    .8987018 |
  |  2     40.5    1.615997    .8987007    .8987018 |
  +-------------------------------------------------+

So because predict, residuals doesn't give us true level-2 (cluster) residuals, and instead only provides level-1 (observation) residuals, Clyde's code ends up giving us the empirical Bayes prediction of the random effect. Does this mean that the level-2 residual of interest is actually the empirical Bayes prediction? That was my initial thought when I read the question, but perhaps I'm missing something.
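For what it's worth, here is a sketch of the arithmetic behind that result, assuming a random-intercept model written as weight = xb + u + e, with u the pig-level effect and e the observation-level error. If predict, residuals returns the estimate of e, then weight - xb - e can only be the prediction of u. The names e_hat, u_hat, u_from_subtraction and sd_u are mine, chosen for illustration.

```stata
webuse pig, clear
mixed weight week || id:

* Fixed-effects part of the prediction
predict double xb, xb
* Observation-level (level-1) residual
predict double e_hat, residuals
* Pig-level (level-2) random-intercept prediction
predict double u_hat, reffects

* Subtracting the level-1 residual from (weight - xb) leaves the level-2 part
gen double u_from_subtraction = weight - xb - e_hat
summarize u_hat u_from_subtraction

* A level-2 quantity should not vary within a pig
bysort id: egen double sd_u = sd(u_hat)
summarize sd_u
```

The second summarize is just a sanity check that u_hat is constant within id, which is what a cluster-level quantity should look like.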

I would also love an explanation about what is going on with predict, residuals relevel().


Jennifer Hauschildt


13 Aug 2024, 05:49

Thank you for your help, Clyde Schechter and Erik Ruzek!

Code:

predict l2res2, reffects

does indeed give me the same estimates as Clyde's way of calculating L1 residuals. I've been reading a bit about BLUPs (which, as far as I understand, are what predict, reffects calculates), but I am unable to work out whether they are actually the same as L1 residuals or whether they measure something else.

In addition, for me there is also no difference in predictions for

Code:

predict l2res, residuals relevel(L2)

and

Code:

predict l2res, residuals

Here is what I've done and the output I got:

Code:

predict l2res, residuals relevel(Region_kat)

predict l2res2, res

predict predicted, xb

predict blup, reffects

gen l1res = F01 - predicted - l2res

list lfd F01 l2res l2res2 blup l1res in 1/12, noobs sep(12)

  +--------------------------------------------------------------------------------+
  |      lfd                   F01       l2res      l2res2        blup       l1res |
  |--------------------------------------------------------------------------------|
  | 10000004          5. Zufrieden   -,1256996   -,1256996    ,1873648    ,1873647 |
  | 10000007          5. Zufrieden   -,3762915   -,3762915    ,4293368    ,4293368 |
  | 10000008          5. Zufrieden    ,5353866    ,5353866   -,4506869   -,4506869 |
  | 10000014     4. Eher zufrieden    -1,15214    -1,15214     ,131908    ,1319079 |
  | 10000043     6. Sehr zufrieden    ,6140783    ,6140783    ,4025768    ,4025766 |
  | 10000048          5. Zufrieden   -,3397757   -,3397757    ,4025768    ,4025766 |
  | 10000076     4. Eher zufrieden   -1,348396   -1,348396    ,4025768    ,4025768 |
  | 10000095   3. Eher unzufrieden   -1,339205   -1,339205   -,6535348   -,6535349 |
  | 10000102   1. Sehr unzufrieden   -4,373418   -4,373418    ,4025768    ,4025769 |
  | 10000136     4. Eher zufrieden   -,8631215   -,8631215   -,0994606   -,0994606 |
  | 10000137     4. Eher zufrieden   -,2085489   -,2085489   -,6535348   -,6535345 |
  | 10000169     6. Sehr zufrieden    ,6701893    ,6701893    ,3370711     ,337071 |
  +--------------------------------------------------------------------------------+
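In the listing above, the l1res column matches blup, which suggests that it holds the level-2 prediction, while l2res and l2res2 hold the observation-level residuals. Under that reading, the two normality checks could be sketched as follows. This is only an illustration on the pig data used earlier in the thread, not on my own model: it ignores the pweights, and the names e_hat, u_hat, one_per_id, l1_qq and l2_qq are made up. The level-2 plot uses one row per cluster, because the prediction is repeated on every row of a cluster.

```stata
webuse pig, clear
mixed weight week || id:

* Level-1 residuals: one value per observation
predict double e_hat, residuals
qnorm e_hat, name(l1_qq, replace)

* Level-2 predictions: one value per cluster, repeated on each of its rows
predict double u_hat, reffects
egen byte one_per_id = tag(id)
qnorm u_hat if one_per_id, name(l2_qq, replace)

* Rough look at how the two levels relate
correlate e_hat u_hat
```

Without the one_per_id restriction, larger clusters would contribute more points to the level-2 plot than smaller ones.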
