  • mixed effects modelling using 'mixed'

    Hi

    Below are two outputs from mixed effects modelling using Stata's 'mixed' command (both models have a quadratic term for the time variable, which is duration in months). The two models are identical in all respects (total N, number of observations, predictors controlled for, etc.). The only difference between the two models is the variable 'beta': it is categorical in the first model but continuous in the second one (betax). Estimates for both fixed effects (all predictors) and random effects are pretty much the same in both models (except of course for the dka variable). The other difference is the constant. I'm unable to explain why there is such a large difference in the constant. The constant (89.4) in the first model (with categorical beta) is more realistic. The constant in the second model (203.5) doesn't make any sense to me (too high). Can anyone help explain why there is a difference in the constants between the two models?

    Code:
    mixed hba1cifcc2 durationm durationm2 ib3.beta sex1 diagage i.ethnicnew4|| id: durationm, cov(unstr) mle var
    Code:
    Performing EM optimization:
     
    Performing gradient-based optimization:
     
    Iteration 0:   log likelihood = -3262.4239 
    Iteration 1:   log likelihood = -3262.3548 
    Iteration 2:   log likelihood = -3262.3548 
     
    Computing standard errors:
     
    Mixed-effects ML regression                     Number of obs      =       777
    Group variable: id                              Number of groups   =       341
     
                                                    Obs per group: min =         1
                                                                   avg =       2.3
                                                                   max =         6
     
     
                                                    Wald chi2(9)       =    617.58
    Log likelihood = -3262.3548                     Prob > chi2        =    0.0000
     
    -----------------------------------------------------------------------------------
           hba1cifcc2 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    ------------------+----------------------------------------------------------------
            durationm |  -20.58428   .9159591   -22.47   0.000    -22.37953   -18.78903
           durationm2 |    2.53228   .1469149    17.24   0.000     2.244332    2.820228
                      |
                 beta |
         severe <7.1  |   5.582428    2.16643     2.58   0.010     1.336303    9.828553
    moderate 7.1-7.3  |   4.491778    1.73957     2.58   0.010     1.082283    7.901273
                      |
                 sex1 |  -.2651881   1.491139    -0.18   0.859    -3.187768    2.657392
              diagage |   .1718992   .1781936     0.96   0.335    -.1773538    .5211521
                      |
           ethnicnew4 |
               mixed  |   6.418724   2.681912     2.39   0.017     1.162273    11.67518
               Black  |   4.660201   1.970876     2.36   0.018     .7973555    8.523047
         asian-other  |   8.262195     1.8134     4.56   0.000     4.707997    11.81639
                      |
                _cons |   89.40784   3.413797    26.19   0.000     82.71692    96.09876
    -----------------------------------------------------------------------------------
     
    ------------------------------------------------------------------------------
      Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
    -----------------------------+------------------------------------------------
    id: Unstructured             |
                   var(durati~m) |   19.04771   3.314386      13.54347    26.78895
                      var(_cons) |   304.2299    42.1261      231.9198    399.0856
             cov(durati~m,_cons) |  -61.08339   10.62844     -81.91474   -40.25203
    -----------------------------+------------------------------------------------
                   var(Residual) |   119.1177   11.36452      98.80222    143.6104
    ------------------------------------------------------------------------------
    LR test vs. linear regression:       chi2(3) =   122.30   Prob > chi2 = 0.0000
     
    Note: LR test is conservative and provided only for reference.
    Code:
    mixed hba1cifcc2 durationm durationm2 betax sex1 diagage i.ethnicnew4|| id: durationm, cov(unstr) mle var
    Code:
    Performing EM optimization:
     
    Performing gradient-based optimization:
     
    Iteration 0:   log likelihood = -3262.2805 
    Iteration 1:   log likelihood = -3262.2142 
    Iteration 2:   log likelihood = -3262.2142 
     
    Computing standard errors:
     
    Mixed-effects ML regression                     Number of obs      =       777
    Group variable: id                              Number of groups   =       341
     
                                                    Obs per group: min =         1
                                                                   avg =       2.3
                                                                   max =         6
     
     
                                                    Wald chi2(8)       =    619.87
    Log likelihood = -3262.2142                     Prob > chi2        =    0.0000
     
    ------------------------------------------------------------------------------
      hba1cifcc2 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
       durationm |  -20.65295   .9154331   -22.56   0.000    -22.44717   -18.85874
      durationm2 |   2.542909    .146907    17.31   0.000     2.254977    2.830841
           betax |  -15.37712   4.722978    -3.26   0.001    -24.63399   -6.120257
            sex1 |  -.2493569    1.49386    -0.17   0.867     -3.17727    2.678556
         diagage |   .1491046   .1768503     0.84   0.399    -.1975156    .4957248
                 |
      ethnicnew4 |
          mixed  |   6.803754   2.671015     2.55   0.011      1.56866    12.03885
          Black  |    4.59069   1.967672     2.33   0.020     .7341242    8.447255
    asian-other  |   8.389881   1.816718     4.62   0.000     4.829179    11.95058
                 |
           _cons |   203.5288   34.00385     5.99   0.000     136.8825    270.1751
    ------------------------------------------------------------------------------
     
    ------------------------------------------------------------------------------
      Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
    -----------------------------+------------------------------------------------
    id: Unstructured             |
                   var(durati~m) |   18.97781   3.306243      13.48817    26.70172
                      var(_cons) |   301.2808   41.93402      229.3487    395.7735
             cov(durati~m,_cons) |  -60.38939   10.58231     -81.13033   -39.64844
    -----------------------------+------------------------------------------------
                   var(Residual) |   118.8819    11.3258      98.63302    143.2878
    ------------------------------------------------------------------------------
    LR test vs. linear regression:       chi2(3) =   122.30   P

  • #2
    Amal:
    I cannot say whether what follows helps to sniff out the culprit (if any), but the sign of -betax- is reversed vs -beta- (and both are statistically significant, for what it's worth).
    Kind regards,
    Carlo
    (Stata 19.0)



    • #3
      Remember that the constant term in a linear regression is the expected value of the outcome when all of the predictors are zero. For categorical variables like race, this means the reference (omitted) category. For continuous variables, it means they take on the literal value zero. Now, these two models differ in the way you represent the variable beta. In the first model, beta takes on three values: 0, 1, or 2. So, when beta = 0 (which is probably true of some appreciable number of observations in your data), the expected value is about 89. If everything else stays around 0, for beta = 1, the expected outcome is about 89 + 5 = 94, and for beta = 2 it is also around 89 + 5 = 94.

      Now let's look at what the second model says. I don't know what the distribution of the continuous variable betax is, but from the labels attached to the corresponding categorized variable beta, I'm guessing that its values are typically about 7. I'll hazard a further guess that betax is rarely, if ever, actually zero in the data. So the expected outcome when everything is set to zero is about 204, but that describes, at best, a very unusual circumstance and maybe one that never occurs, and perhaps is, in principle, impossible. So that 204 doesn't really mean very much. Let's take a person whose value of betax is 7, and has all other variables zero. Then the expected value of the outcome is now about 204 - 15.4*7 = 96.2. In your first model, this person would fall in the "severe <7.1" category of beta, so that model predicts an expected value of 89.4 + 5.6 = 95.0: pretty much the same thing!
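
      The comparison above can be checked at full precision with a few scalars. This is only a sketch: the numbers are copied from the two outputs in the first post, the scalar names are made up for illustration, and it assumes that betax = 7 corresponds to the "severe <7.1" category of beta.

      ```stata
      * Coefficients copied from the two outputs in post #1
      scalar cons_cat  = 89.40784     // _cons, model with categorical beta
      scalar b_severe  = 5.582428     // beta: severe <7.1
      scalar cons_cont = 203.5288     // _cons, model with continuous betax
      scalar b_betax   = -15.37712    // slope of betax

      * Expected outcome with all other predictors at zero / reference level
      scalar pred_cat  = cons_cat + b_severe      // model 1, severe category
      scalar pred_cont = cons_cont + b_betax*7    // model 2, betax = 7

      display "Model 1 (severe <7.1): " pred_cat
      display "Model 2 (betax = 7):   " pred_cont
      ```

      Both come out at roughly 95 to 96. The small gap from the 96.2 quoted above is just rounding of the slope to 15.4.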

      So your two models are really predicting more or less the same things in the range of observed values of your data. They are nearly equivalent. You just have to understand that the constant terms mean something different in the two models because of the difference in the specification of beta. Once you put the constant term and the beta specification together for a more complete picture, you see that there is no conflict here.

      As an aside, you should not calculate a durationm2 variable for your quadratic term. Use factor variable notation and let Stata do the work for you. So remove both durationm and durationm2 from your equation and replace them by c.durationm##c.durationm. That way, when it comes time to predict and graph expected values, Stata will be able to do it for you the easy way with the -margins- command.
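
      A minimal sketch of that change, reusing the first model from post #1. The grid of durationm values in -at()- is purely illustrative, since the observed range of durationm is not shown in this thread; adjust it to the data.

      ```stata
      * Same model as before, with the quadratic written in factor-variable notation
      mixed hba1cifcc2 c.durationm##c.durationm ib3.beta sex1 diagage i.ethnicnew4 ///
          || id: durationm, cov(unstr) mle var

      * Expected values over an illustrative grid of durationm (assumed range)
      margins, at(durationm = (0(1)6))

      * Plot the predicted trajectory
      marginsplot
      ```

      Because Stata now knows the squared term is a function of durationm, -margins- moves both terms together when durationm changes, which it cannot do with a hand-made durationm2.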



      • #4
        Hi Clyde

        Thanks for the detailed reply, much appreciated! And it makes total sense. I feel quite silly now for not figuring this out on my own!!

        You're right - betax - the continuous variable has a rather narrow range from 6.7 to 7.7 with a mean of about 7.3. Values below 6.7 are clinically not possible (anyone with a lower value wouldn't be alive).

        So, I guess centering the continuous beta (betax) around its lowest value of 6.7 would make it more straightforward to understand the constant (and for any graphs as well).
        Code:
        gen betacenter = betax-6.7
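
        As a rough check of what that centering should do, here is the constant implied by the second output once the continuous variable is shifted by 6.7. The scalar names are invented for illustration, and the slope itself should be unchanged by the shift.

        ```stata
        * Coefficients copied from the second output in post #1
        scalar cons_cont = 203.5288     // _cons with uncentered betax
        scalar b_betax   = -15.37712    // slope of betax

        * Constant implied after centering at 6.7:
        * expected outcome at betax = 6.7 with all other predictors at zero / reference
        scalar cons_centered = cons_cont + b_betax*6.7
        display "Implied _cons after centering: " cons_centered
        ```

        That gives about 100.5, which is on the same scale as the 89.4 from the categorical model.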
        Thanks once again!

        /Amal
