  • Testing Assumptions for Multilevel Linear Regression

    Hello,
    I am using StataMP 18.0 on Windows 10 and want to test the assumptions of my multilevel linear regression.

    I have used the mixed command as follows:

    Code:
    mixed y x1 x2 [pw = weight] || L2:, mle
    I have found the following assumptions for MLM:
    1. The model is correctly specified (i.e., all the predictors associated with the outcome and relevant random effects are included)
      1. I would do this with a self-made link test, as described here: https://www.statalist.org/forums/for...el-assumptions
    2. The functional form is correct (e.g., the relationship between the predictors and outcome is linear if using a linear model);
      1. This will be assessed by theory
    3. Level-1 residuals are independent and normally distributed;
      1. I know how to assess the normality of residuals with a Q-Q plot (using predict, res and qnorm/pnorm), but I do not understand how to extract the level-1 residuals only.
    4. Level-2 residuals are independent and multivariate normally distributed;
      1. I would do this the following way:

        Code:
        predict l2res, res relevel(L2)
        qnorm l2res
        pnorm l2res
        Is that correct?
    5. Residuals at level-1 and level-2 are unrelated;
      1. I assume I could test this by correlating the level-1 and level-2 residuals, but for that, again, I would need to know how to extract the level-1 residuals on their own.
    6. Predictors at one level are not related to errors at another level (exogeneity, often called the random-effects assumption).
      1. I have no idea how I should test this. I found the article by Antonakis et al. (2021), "On ignoring the random effects assumption in multilevel models: Review, critique, and recommendations", but I have trouble fully understanding it and whether it actually tests what I want to test.
    Any help with any of the assumptions would be very much appreciated!

  • #2
    To calculate the level-1 residuals:
    Code:
    // DO THIS AFTER YOU HAVE CALCULATED L2RES
    predict xb, xb
    gen l1res = y - xb - l2res
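
    A minimal sketch of the same decomposition on a public dataset, assuming a two-level random-intercept model in which each response splits into the fixed part, the predicted random intercept, and the level-1 residual. The dataset (webuse pig) and the names u_id, e_manual, e_stata, and gap are illustrative choices, not taken from the model above; here the level-2 term comes from predict, reffects.

    ```stata
    * Illustrative data and model (assumption: two levels, random intercept only)
    webuse pig, clear
    mixed weight week || id:

    * Fixed-portion linear prediction
    predict double xb, xb
    * Predicted random intercept (BLUP), constant within each id
    predict double u_id, reffects

    * Level-1 residual built by hand: response minus fixed part minus random intercept
    gen double e_manual = weight - xb - u_id

    * Residuals as returned by predict, for comparison
    predict double e_stata, residuals

    * Absolute gap between the two versions; inspect its maximum
    gen double gap = abs(e_manual - e_stata)
    summarize gap
    ```

    If the maximum of gap is zero up to rounding, the hand-built residual and the one from predict, residuals are the same quantity for this model.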



    • #3
      I wonder if this is something that is different in newer versions of Stata (I have v16), but based on playing around with the predicted residuals, I see no difference between the following methods of obtaining residuals:
      Code:
      webuse pig, clear 
      mixed weight week || id: , cov(un)
      predict l2res, residuals
      predict l2res2, residuals relevel(id)
      * Fixed + random effect prediction
      predict fitted, fitted 
      *Observed - fitted
      gen obs_fit = weight - fitted
      list id weight l2res l2res2 obs_fit in 1/12, noobs sep(12)
      
        +-------------------------------------------------+
        | id   weight       l2res      l2res2     obs_fit |
        |-------------------------------------------------|
        |  1       24    .1175955    .1175955    .1175957 |
        |  1       32      1.9077      1.9077      1.9077 |
        |  1       39    2.697804    2.697804    2.697803 |
        |  1     42.5    -.012092    -.012092   -.0120926 |
        |  1       48   -.7219878   -.7219878   -.7219887 |
        |  1     54.5   -.4318836   -.4318836   -.4318848 |
        |  1       61   -.1417795   -.1417795   -.1417809 |
        |  1       65   -2.351675   -2.351675   -2.351677 |
        |  1       72   -1.561571   -1.561571   -1.561569 |
        |  2     22.5   -3.964211   -3.964211   -3.964211 |
        |  2     30.5   -2.174107   -2.174107   -2.174107 |
        |  2     40.5    1.615997    1.615997    1.615997 |
        +-------------------------------------------------+
      I interpret this as evidence that predict, residuals is ignoring the relevel() option and is returning only the level-1 residuals.

      Then, taking Clyde Schechter's code, we get the following added on:
      Code:
      predict xb, xb
      gen l1res = weight - xb - l2res
      *EB prediction for comparison
      predict reffect, reffects
      
        +-------------------------------------------------+
        | id   weight       l2res       l1res    reffect1 |
        |-------------------------------------------------|
        |  1       24    .1175955   -1.683105   -1.683105 |
        |  1       32      1.9077   -1.683106   -1.683105 |
        |  1       39    2.697804   -1.683106   -1.683105 |
        |  1     42.5    -.012092   -1.683106   -1.683105 |
        |  1       48   -.7219878   -1.683106   -1.683105 |
        |  1     54.5   -.4318836   -1.683107   -1.683105 |
        |  1       61   -.1417795   -1.683103   -1.683105 |
        |  1       65   -2.351675   -1.683107   -1.683105 |
        |  1       72   -1.561571   -1.683104   -1.683105 |
        |  2     22.5   -3.964211    .8987012    .8987018 |
        |  2     30.5   -2.174107     .898701    .8987018 |
        |  2     40.5    1.615997    .8987007    .8987018 |
        +-------------------------------------------------+
      So, because predict, residuals does not give us true level-2 (cluster) residuals and instead provides only level-1 (observation) residuals, Clyde's code ends up giving us the empirical Bayes prediction of the random effect. Does this mean that the level-2 residual of interest is actually the empirical Bayes prediction? That was my initial thought when I read the question, but perhaps I'm missing something.

      I would also love an explanation about what is going on with predict, residuals relevel().
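
      One possible reading, offered as a sketch rather than a definitive answer: as I understand the documentation for predict after mixed, relevel() controls how far down the hierarchy the random effects enter the fitted values, so with a single grouping level it cannot change anything. The three-level example below uses the productivity data from the Stata manuals; the variable names r_default, r_state, r_region, u_region, u_state, d_state, and d_region are hypothetical, and the assumed order of the reffects variables (region first, then state) should be confirmed from their variable labels.

      ```stata
      * Three-level model: observations within state within region
      webuse productivity, clear
      mixed gsp private emp hwy water other unemp || region: || state:

      * Residuals using random effects from all levels (the default)
      predict double r_default, residuals
      * Residuals using random effects down to and including state
      predict double r_state, residuals relevel(state)
      * Residuals using random effects down to and including region only
      predict double r_region, residuals relevel(region)

      * Predicted random intercepts; assumed order follows the model: region, then state
      predict double u_region u_state, reffects
      describe u_region u_state

      * Under the reading above, both gaps should be zero up to rounding
      gen double d_state  = abs(r_default - r_state)
      gen double d_region = abs(r_region - r_default - u_state)
      summarize d_state d_region
      ```

      If d_region is near zero, then relevel(region) leaves the state-level random intercept inside the residual, which is the case where the option makes a visible difference.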



      • #4
        Thank you for your help, Clyde Schechter and Erik Ruzek!

        Code:
        predict l2res2, reffects
        does indeed give me the same values as Clyde's way of calculating L1 residuals. I've been reading a bit about BLUPs (which, as far as I understand, are what predict, reffects calculates), but I am unable to tell whether they are actually the same as L1 residuals or something else.

        In addition, for me there is also no difference in predictions for
        Code:
        predict l2res, residuals relevel(L2)
        and
        Code:
        predict l2res, residuals
        Here is what I've done and the output I got:
        Code:
        predict l2res, residuals relevel(Region_kat)
        
        predict l2res2, res
        
        predict predicted, xb
        
        predict blup, reffects
        
        gen l1res = F01 - predicted - l2res
        
        list lfd F01 l2res l2res2 blup l1res in 1/12, noobs sep(12)
        
          +--------------------------------------------------------------------------------+
          |      lfd                   F01       l2res      l2res2        blup       l1res |
          |--------------------------------------------------------------------------------|
          | 10000004          5. Zufrieden   -,1256996   -,1256996    ,1873648    ,1873647 |
          | 10000007          5. Zufrieden   -,3762915   -,3762915    ,4293368    ,4293368 |
          | 10000008          5. Zufrieden    ,5353866    ,5353866   -,4506869   -,4506869 |
          | 10000014     4. Eher zufrieden    -1,15214    -1,15214     ,131908    ,1319079 |
          | 10000043     6. Sehr zufrieden    ,6140783    ,6140783    ,4025768    ,4025766 |
          | 10000048          5. Zufrieden   -,3397757   -,3397757    ,4025768    ,4025766 |
          | 10000076     4. Eher zufrieden   -1,348396   -1,348396    ,4025768    ,4025768 |
          | 10000095   3. Eher unzufrieden   -1,339205   -1,339205   -,6535348   -,6535349 |
          | 10000102   1. Sehr unzufrieden   -4,373418   -4,373418    ,4025768    ,4025769 |
          | 10000136     4. Eher zufrieden   -,8631215   -,8631215   -,0994606   -,0994606 |
          | 10000137     4. Eher zufrieden   -,2085489   -,2085489   -,6535348   -,6535345 |
          | 10000169     6. Sehr zufrieden    ,6701893    ,6701893    ,3370711     ,337071 |
          +--------------------------------------------------------------------------------+
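
        For the normality checks in points 3 and 4 of the original question, a minimal sketch, assuming a two-level random-intercept model fitted to the public pig data rather than the survey data above. The names e1, u2, fit, and pick are hypothetical. The level-1 Q-Q plot uses every observation, while the level-2 Q-Q plot uses one row per cluster, because the predicted random intercept is constant within a cluster.

        ```stata
        * Illustrative data and model (assumption: two levels, random intercept only)
        webuse pig, clear
        mixed weight week || id:

        * Level-1 residuals: response minus fitted values (fixed part plus random intercept)
        predict double e1, residuals
        * Level-2 predictions: empirical Bayes (BLUP) random intercepts
        predict double u2, reffects
        * Fitted values including the random intercept
        predict double fit, fitted

        * Flag one observation per cluster so each cluster counts once at level 2
        egen byte pick = tag(id)

        * Q-Q plot of level-1 residuals, all observations
        qnorm e1, name(qq_level1, replace)
        * Q-Q plot of level-2 predictions, one row per cluster
        qnorm u2 if pick, name(qq_level2, replace)

        * Level-1 residuals against fitted values, to eyeball constant spread
        scatter e1 fit, yline(0) name(rvf_level1, replace)
        ```

        Plotting u2 without the pick flag would repeat each cluster's value once per observation, which overweights large clusters in the level-2 plot.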
