mixed effects modelling using 'mixed'

Amal Khanolkar

Join Date: Feb 2015
Posts: 142

11 Apr 2017, 08:09

Hi

Below are two outputs from mixed effects modelling using Stata's 'mixed' command (both models have a quadratic term for the time variable, which is duration in months). The two models are identical in all respects (total N, number of observations, predictors controlled for, etc.). The only difference between the two models is the variable 'beta': it is categorical in the first model but continuous in the second one (betax). Estimates for both the fixed effects (all predictors) and the random effects are pretty much the same in both models (except of course for the dka variable). The other difference is the constant, and I'm unable to explain why it differs so much. The constant (89.4) in the first model (with categorical beta) is more realistic. The constant in the second model (203.5) doesn't make any sense to me (too high). Can anyone help explain why there is a difference in the constants between the two models?

Code:

mixed hba1cifcc2 durationm durationm2 ib3.beta sex1 diagage i.ethnicnew4|| id: durationm, cov(unstr) mle var

Code:

Performing EM optimization:
 
Performing gradient-based optimization:
 
Iteration 0:   log likelihood = -3262.4239 
Iteration 1:   log likelihood = -3262.3548 
Iteration 2:   log likelihood = -3262.3548 
 
Computing standard errors:
 
Mixed-effects ML regression                     Number of obs      =       777
Group variable: id                              Number of groups   =       341
 
                                                Obs per group: min =         1
                                                               avg =       2.3
                                                               max =         6
 
 
                                                Wald chi2(9)       =    617.58
Log likelihood = -3262.3548                     Prob > chi2        =    0.0000
 
-----------------------------------------------------------------------------------
       hba1cifcc2 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
------------------+----------------------------------------------------------------
        durationm |  -20.58428   .9159591   -22.47   0.000    -22.37953   -18.78903
       durationm2 |    2.53228   .1469149    17.24   0.000     2.244332    2.820228
                  |
             beta |
     severe <7.1  |   5.582428    2.16643     2.58   0.010     1.336303    9.828553
moderate 7.1-7.3  |   4.491778    1.73957     2.58   0.010     1.082283    7.901273
                  |
             sex1 |  -.2651881   1.491139    -0.18   0.859    -3.187768    2.657392
          diagage |   .1718992   .1781936     0.96   0.335    -.1773538    .5211521
                  |
       ethnicnew |
           mixed  |   6.418724   2.681912     2.39   0.017     1.162273    11.67518
           Black  |   4.660201   1.970876     2.36   0.018     .7973555    8.523047
     asian-other  |   8.262195     1.8134     4.56   0.000     4.707997    11.81639
                  |
            _cons |   89.40784   3.413797    26.19   0.000     82.71692    96.09876
-----------------------------------------------------------------------------------
 
------------------------------------------------------------------------------
  Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
id: Unstructured             |
               var(durati~m) |   19.04771   3.314386      13.54347    26.78895
                  var(_cons) |   304.2299    42.1261      231.9198    399.0856
         cov(durati~m,_cons) |  -61.08339   10.62844     -81.91474   -40.25203
-----------------------------+------------------------------------------------
               var(Residual) |   119.1177   11.36452      98.80222    143.6104
------------------------------------------------------------------------------
LR test vs. linear regression:       chi2(3) =   122.30   Prob > chi2 = 0.0000
 
Note: LR test is conservative and provided only for reference.

Code:

mixed hba1cifcc2 durationm durationm2 betax sex1 diagage i.ethnicnew4|| id: durationm, cov(unstr) mle var

Code:

Performing EM optimization:
 
Performing gradient-based optimization:
 
Iteration 0:   log likelihood = -3262.2805 
Iteration 1:   log likelihood = -3262.2142 
Iteration 2:   log likelihood = -3262.2142 
 
Computing standard errors:
 
Mixed-effects ML regression                     Number of obs      =       777
Group variable: id                              Number of groups   =       341
 
                                                Obs per group: min =         1
                                                               avg =       2.3
                                                               max =         6
 
 
                                                Wald chi2(8)       =    619.87
Log likelihood = -3262.2142                     Prob > chi2        =    0.0000
 
------------------------------------------------------------------------------
  hba1cifcc2 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   durationm |  -20.65295   .9154331   -22.56   0.000    -22.44717   -18.85874
  durationm2 |   2.542909    .146907    17.31   0.000     2.254977    2.830841
       betax |  -15.37712   4.722978    -3.26   0.001    -24.63399   -6.120257
        sex1 |  -.2493569    1.49386    -0.17   0.867     -3.17727    2.678556
     diagage |   .1491046   .1768503     0.84   0.399    -.1975156    .4957248
             |
  ethnicnew |
      mixed  |   6.803754   2.671015     2.55   0.011      1.56866    12.03885
      Black  |    4.59069   1.967672     2.33   0.020     .7341242    8.447255
asian-other  |   8.389881   1.816718     4.62   0.000     4.829179    11.95058
             |
       _cons |   203.5288   34.00385     5.99   0.000     136.8825    270.1751
------------------------------------------------------------------------------
 
------------------------------------------------------------------------------
  Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
id: Unstructured             |
               var(durati~m) |   18.97781   3.306243      13.48817    26.70172
                  var(_cons) |   301.2808   41.93402      229.3487    395.7735
         cov(durati~m,_cons) |  -60.38939   10.58231     -81.13033   -39.64844
-----------------------------+------------------------------------------------
               var(Residual) |   118.8819    11.3258      98.63302    143.2878
------------------------------------------------------------------------------
LR test vs. linear regression:       chi2(3) =   122.30   P


Carlo Lazzaro

Join Date: Apr 2014

Posts: 17712
#2

11 Apr 2017, 08:43

Amal:
I cannot say whether what follows helps to sniff out the culprit (if any), but the sign of -betax- is reversed vs -beta- (and both are statistically significant, for what it's worth).

Kind regards,
Carlo
(Stata 19.0)
Clyde Schechter

Join Date: Apr 2014

Posts: 30117
#3

11 Apr 2017, 09:11

Remember that the constant term in a linear regression is the expected value of the outcome when all of the predictors are zero. For categorical variables like race, this means the reference (omitted) category. For continuous variables, it means they take on the literal value zero. Now, these two models differ in the way you represent the variable beta. In the first model, beta is a three-level categorical variable with level 3 as the reference category (ib3.beta). So, when beta is at its reference level (which is probably true of some appreciable number of observations in your data) and everything else is zero, the expected value is about 89. If everything else stays at 0, the expected outcome in the "severe <7.1" category is about 89.4 + 5.6 = 95.0, and in the "moderate 7.1-7.3" category it is about 89.4 + 4.5 = 93.9.

Now let's look at what the second model says. I don't know what the distribution of the continuous variable betax is, but from the labels attached to the corresponding categorized variable beta, I'm guessing that its values are typically about 7. I'll hazard a further guess that betax is rarely, if ever, actually zero in the data. So the expected outcome when everything is set to zero is about 204, but that describes, at best, a very unusual circumstance, maybe one that never occurs, and perhaps one that is, in principle, impossible. So that 204 doesn't really mean very much. Let's take a person whose value of betax is 7 and who has all other variables at zero. Then the expected value of the outcome is about 204 - 15.4*7 = 96.2. In your first model, this person falls in the "severe <7.1" category of beta, so that model predicts an expected value of 89.4 + 5.6 = 95.0: pretty much the same thing!

So your two models are really predicting more or less the same things in the range of observed values of your data. They are nearly equivalent. You just have to understand that the constant terms mean something different in the two models because of the difference in the specification of beta. Once you put the constant term and the beta specification together for a more complete picture, you see that there is no conflict here.
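
The comparison above can be checked directly from the posted output. What follows is only a minimal sketch: it plugs the unrounded coefficients from the two tables into -display-, and then shows how -lincom- could give the same quantity with a standard error, assuming the second model has just been fit. The value betax = 7 is an illustrative choice, not something taken from the data. With the unrounded coefficients the two results come out at roughly 95.9 and 95.0.

```stata
* Model 2 (continuous betax): expected outcome at betax = 7, all other predictors 0
display 203.5288 + (-15.37712)*7

* Model 1 (categorical beta): constant plus the "severe <7.1" coefficient
display 89.40784 + 5.582428

* Same quantity for model 2 with a standard error and confidence interval,
* assuming the second model is the most recent estimation
mixed hba1cifcc2 durationm durationm2 betax sex1 diagage i.ethnicnew4 ///
    || id: durationm, cov(unstr) mle var
lincom _cons + 7*betax
```

The -display- lines only redo the arithmetic; -lincom- is the version to prefer in practice because it also reports the uncertainty of the linear combination.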

As an aside, you should not calculate a durationm2 variable for your quadratic term. Use factor variable notation and let Stata do the work for you. So remove both durationm and durationm2 from your equation and replace them by c.durationm##c.durationm. That way, when it comes time to predict and graph expected values, Stata will be able to do it for you the easy way with the -margins- command.
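
A sketch of what that suggestion might look like for the first model. The -mixed- call is the one from the original post with only the quadratic term rewritten; the grid of durations passed to -margins- (0 to 6 in steps of 1) is a placeholder assumption and should be replaced with values that span the observed range of durationm.

```stata
* First model, with the quadratic written in factor-variable notation
* (replaces the separate durationm and durationm2 terms)
mixed hba1cifcc2 c.durationm##c.durationm ib3.beta sex1 diagage i.ethnicnew4 ///
    || id: durationm, cov(unstr) mle var

* Predicted outcome (fixed portion) over a grid of durations.
* NOTE: 0(1)6 is a hypothetical grid; use values that cover your own data.
margins, at(durationm = (0(1)6))

* Plot the predictions from the preceding -margins- call
marginsplot
```

Because Stata knows that the squared term is a function of durationm, -margins- moves both terms together when durationm changes, which it cannot do with a hand-made durationm2 variable.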
Amal Khanolkar

Join Date: Feb 2015

Posts: 142
#4

11 Apr 2017, 10:34

Hi Clyde

Thanks for the detailed reply, much appreciated! And it makes total sense. I feel quite silly now for not figuring this out on my own!!

You're right - betax - the continuous variable has a rather narrow range from 6.7 to 7.7 with a mean of about 7.3. Values below 6.7 are clinically not possible (anyone with a lower value wouldn't be alive).

So, I guess centering the continuous variable betax around its lowest value of 6.7 would make the constant more straightforward to interpret (and any graphs as well).

Code:

gen betacenter = betax - 6.7
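
A possible way to carry this through, offered as a sketch rather than a definitive recipe. It assumes betax is the continuous variable from the second model, takes the minimum from -summarize- instead of typing 6.7 by hand, and uses betax_c as a hypothetical variable name. If the minimum is exactly 6.7, the new constant should be about 203.5 - 15.4*6.7, that is roughly 100.5, with the slope on the centered variable and all other coefficients unchanged.

```stata
* Assumption: betax is the continuous variable used in the second model
summarize betax
generate betax_c = betax - r(min)   // betax_c is a hypothetical name

* Refit the second model with the centered variable
mixed hba1cifcc2 durationm durationm2 betax_c sex1 diagage i.ethnicnew4 ///
    || id: durationm, cov(unstr) mle var

* Constant implied by the posted second-model output at betax = 6.7
display 203.5288 + (-15.37712)*6.7
```

Centering only shifts where zero sits on the betax scale, so the log likelihood and the random-effects estimates should match the uncentered fit.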

Thanks once again!

/Amal