  • Residual slope and Residual intercept

    Hi Folks,

    I have a three-level mixed model being analyzed in Stata 14.1, where observations are recorded multiple times over the study period for each level 2 unit (id2). The level 1 term is a region (basinid). The random intercepts seem to be important based on the first result, which has sizable var(_cons) terms. However, in the second model, where I include a relevant random slope at the 2nd level, the id2 var(_cons) is essentially zero (see below). How do I explain this? Does it mean that the difference in timelag within each level 2 unit explains all of the random variation for that unit? I also see that the level 1 var(_cons) is diminished quite a bit when I add that random slope. I am not sure what to make of it. Thanks!

    Code:
    . eststo a1b: mixed dtwch timelag perc_drought ///
    > riv_km2 popden_c100 alt_avg1000 if absdtwchtime<400 ///
    > ||basinid:  ||id2:, mle
    
    Performing EM optimization:
    
    Performing gradient-based optimization:
    
    Iteration 0:   log likelihood = -192190.45  
    Iteration 1:   log likelihood =  -192189.6  
    Iteration 2:   log likelihood =  -192189.6  
    
    Computing standard errors:
    
    Mixed-effects ML regression                     Number of obs     =     46,334
    
    -------------------------------------------------------------
                    |     No. of       Observations per Group
     Group Variable |     Groups    Minimum    Average    Maximum
    ----------------+--------------------------------------------
            basinid |         26        120    1,782.1      6,175
                id2 |      3,791          1       12.2         72
    -------------------------------------------------------------
    
                                                    Wald chi2(5)      =     164.35
    Log likelihood =  -192189.6                     Prob > chi2       =     0.0000
    
    ------------------------------------------------------------------------------
           dtwch |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
         timelag |   .1940898   .0235906     8.23   0.000      .147853    .2403266
    perc_drought |   .0168127   .0027138     6.20   0.000     .0114938    .0221317
         riv_km2 |   .0238919   .0098938     2.41   0.016     .0045003    .0432835
     popden_c100 |  -.0540279   .0694508    -0.78   0.437     -.190149    .0820932
     alt_avg1000 |  -.0443424   .3004287    -0.15   0.883    -.6331718     .544487
           _cons |  -1.456768    1.45866    -1.00   0.318    -4.315689    1.402152
    ------------------------------------------------------------------------------
    
    ------------------------------------------------------------------------------
      Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
    -----------------------------+------------------------------------------------
    basinid: Identity            |
                      var(_cons) |   2.456454   1.407536      .7990503     7.55167
    -----------------------------+------------------------------------------------
    id2: Identity                |
                      var(_cons) |   196.8926   11.71573      175.2186    221.2477
    -----------------------------+------------------------------------------------
                   var(Residual) |   197.1276   1.542398      194.1276    200.1739
    ------------------------------------------------------------------------------
    LR test vs. linear model: chi2(2) = 0.00                  Prob > chi2 = 1.0000
    Code:
    . eststo a1a: mixed dtwch timelag perc_drought ///
    > riv_km2 popden_c100 alt_avg1000 if absdtwchtime<400 ///
    > ||basinid:  ||id2:timelag,  mle
    
    Performing EM optimization:
    
    Performing gradient-based optimization:
    
    Iteration 0:   log likelihood = -188998.98  
    Iteration 1:   log likelihood = -188876.63  
    Iteration 2:   log likelihood = -188875.85  
    Iteration 3:   log likelihood = -188875.85  
    
    Computing standard errors:
    
    Mixed-effects ML regression                     Number of obs     =     46,334
    
    -------------------------------------------------------------
                    |     No. of       Observations per Group
     Group Variable |     Groups    Minimum    Average    Maximum
    ----------------+--------------------------------------------
            basinid |         26        120    1,782.1      6,175
                id2 |      3,791          1       12.2         72
    -------------------------------------------------------------
    
                                                    Wald chi2(5)      =      70.57
    Log likelihood = -188875.85                     Prob > chi2       =     0.0000
    
    ------------------------------------------------------------------------------
           dtwch |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
         timelag |   .2846827   .0531741     5.35   0.000     .1804633    .3889021
    perc_drought |   .0128998   .0026217     4.92   0.000     .0077614    .0180382
         riv_km2 |   .0022632    .003836     0.59   0.555    -.0052552    .0097815
     popden_c100 |  -.0242854   .0279246    -0.87   0.384    -.0790167    .0304458
     alt_avg1000 |   .0094631   .1238841     0.08   0.939    -.2333454    .2522715
           _cons |   -.147752   .5954296    -0.25   0.804    -1.314773    1.019268
    ------------------------------------------------------------------------------
    
    ------------------------------------------------------------------------------
      Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
    -----------------------------+------------------------------------------------
    basinid: Identity            |
                      var(_cons) |   .3887261   .3092099      .0817637    1.848106
    -----------------------------+------------------------------------------------
    id2: Independent             |
                    var(timelag) |    5.54817   .2302261      5.114797    6.018263
                      var(_cons) |   3.74e-11   2.53e-11      9.92e-12    1.41e-10
    -----------------------------+------------------------------------------------
                   var(Residual) |   177.1193   1.247506      174.6911    179.5814
    ------------------------------------------------------------------------------
    LR test vs. linear model: chi2(3) = 5839.85               Prob > chi2 = 0.0000
    
    Note: LR test is conservative and provided only for reference.

  • #2
    When you have both a random slope and a random intercept, the random intercept variance depends on how the variable whose random slopes you are estimating is centered. It has no absolute interpretation. If you re-center the timelag variable at some other value, the var(_cons) term will change. In fact, if you have a particular value you would like it to be, there is a corresponding centering value for timelag that you can choose to get that. It's completely meaningless in absolute terms.

    To see this, forget about the rest of the model and focus on just the dtwch-timelag relationship. You are modeling it as a linear relationship, but it is a different linear relationship in each group that has its own random slope (id2 in your second model). So if you were to select a few different groups and graph their relationships to dtwch, you would get a cluster of different lines that look more or less like this graph. (I've made this graph from arbitrary numbers, so forget about the actual numbers on the axes.)

    [Image: fanning_lines.png]

    Now, notice the vertical lines that I've added. If timelag is centered at 50, this is where all of these lines intersect and the intercepts are all the same, so the var(_cons) will be zero. But if you re-center timelag at 25, or 10, you see that the intercepts now spread apart, at first somewhat narrowly, and then widely. So as you re-center timelag away from that value of 50, var(_cons) increases, and you can see that you can actually get it to be any value you choose by picking the centering point.
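
    If you want to see this in your own data, a minimal sketch follows. It assumes the variable names from your output; timelag_c is a hypothetical new variable, and centering at the sample mean is only one arbitrary choice. In algebraic terms, if u0 and u1 are a group's random intercept and slope, then shifting the origin of timelag by c changes the intercept variance to var(u0) + 2*c*cov(u0,u1) + c^2*var(u1). Note that your second model uses the default Independent structure, which fixes cov(u0,u1) at zero, so its fit is not invariant to centering. The sketch adds covariance(unstructured) so that re-centering only re-expresses the same model. I have not run this on your data, and convergence is not guaranteed.

    ```stata
    * Center timelag at its estimation-sample mean (an arbitrary illustrative choice)
    summarize timelag if absdtwchtime<400, meanonly
    generate timelag_c = timelag - r(mean)

    * Refit with a random slope on the centered variable, letting the
    * intercept and slope covary within id2
    mixed dtwch timelag_c perc_drought ///
        riv_km2 popden_c100 alt_avg1000 if absdtwchtime<400 ///
        || basinid: || id2: timelag_c, covariance(unstructured) mle
    ```

    Comparing the id2 var(_cons) from this fit with the one in your second model should show how strongly it depends on where timelag equals zero.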

    When you add new terms to a model (in this case the random slopes) that explain some of the variation the previous model left unexplained, the unexplained variation, aka var(Residual), goes down. So this is unsurprising.
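
    If you want a formal comparison of the two fits, something along these lines may help. It is a sketch that assumes the names a1b and a1a given to eststo (from the user-written estout package) are usable as stored estimates, which I believe is how eststo works but have not verified on your setup. Because the added variance component is tested on the boundary of its parameter space, Stata will flag the likelihood-ratio test as conservative.

    ```stata
    * Likelihood-ratio test: random-intercept model (a1b) vs.
    * random-slope model (a1a); both were fit by ML on the same sample
    lrtest a1b a1a

    * AIC and BIC for the two stored models, side by side
    estimates stats a1b a1a
    ```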

    • #3
      Hey Clyde, very helpful. Thank you for your response!
