  • Mixed Models and marginal effects

    Hello everyone!

    I am working with mixed models in Stata 13 (Windows 10).

    I have data from a longitudinal cohort study. We took anthropometric measurements (weight, height, BMI, waist circumference, etc.) on the same subjects at three time points (ages 15, 18 and 25 years) and want to know whether these measurements differ between the genotypes of a certain transcription factor.

    My model looks like this:

    Code:
    . mixed BMI time##ib3.AP2bgeno if sex==1 || ID: time, reml cov(unstructured)
    
    Performing EM optimization:
    
    Performing gradient-based optimization:
    
    Iteration 0:   log restricted-likelihood = -2682.4845  
    Iteration 1:   log restricted-likelihood =  -2678.653  
    Iteration 2:   log restricted-likelihood = -2678.5355  
    Iteration 3:   log restricted-likelihood = -2678.5355  
    
    Computing standard errors:
    
    Mixed-effects REML regression                   Number of obs      =      1177
    Group variable: ID                              Number of groups   =       494
    
                                                    Obs per group: min =         1
                                                                   avg =       2.4
                                                                   max =         3
    
    
                                                    Wald chi2(8)       =   1352.10
    Log restricted-likelihood = -2678.5355          Prob > chi2        =    0.0000
    
    -------------------------------------------------------------------------------
              BMI |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    --------------+----------------------------------------------------------------
             time |
              18  |   2.244071   .1322267    16.97   0.000     1.984911     2.50323
              25  |   4.810346    .173599    27.71   0.000     4.470099    5.150594
                  |
         AP2bgeno |
               1  |  -.5558766   .5341458    -1.04   0.298    -1.602783    .4910299
               2  |  -.6856956   .2769243    -2.48   0.013    -1.228457   -.1429338
                  |
    time#AP2bgeno |
            18 1  |  -.1380638   .3919103    -0.35   0.725    -.9061938    .6300663
            18 2  |  -.0047683   .2105343    -0.02   0.982     -.417408    .4078714
            25 1  |  -.7752342   .5282823    -1.47   0.142    -1.810648    .2601801
            25 2  |  -.1308186    .276787    -0.47   0.636    -.6733111    .4116738
                  |
            _cons |    20.6324   .1779103   115.97   0.000      20.2837    20.98109
    -------------------------------------------------------------------------------
    
    ------------------------------------------------------------------------------
      Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
    -----------------------------+------------------------------------------------
    ID: Unstructured             |
                       var(time) |   .0340546   .0054172      .0249329    .0465135
                      var(_cons) |   10.74794   1.947627      7.534937    15.33102
                 cov(time,_cons) |  -.3869099   .0950002     -.5731068    -.200713
    -----------------------------+------------------------------------------------
                   var(Residual) |   1.549076   .1276659      1.318019    1.820639
    ------------------------------------------------------------------------------
    LR test vs. linear regression:       chi2(3) =   732.49   Prob > chi2 = 0.0000
    
    Note: LR test is conservative and provided only for reference.
    
    .
    . margins time#AP2bgeno, vsquish
    
    Adjusted predictions                              Number of obs   =       1177
    
    Expression   : Linear prediction, fixed portion, predict()
    
    -------------------------------------------------------------------------------
                  |            Delta-method
                  |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
    --------------+----------------------------------------------------------------
    time#AP2bgeno |
            15 1  |   20.07652   .5036464    39.86   0.000     19.08939    21.06365
            15 2  |    19.9467   .2122145    93.99   0.000     19.53077    20.36263
            15 3  |    20.6324   .1779103   115.97   0.000      20.2837    20.98109
            18 1  |   22.18253   .5488286    40.42   0.000     21.10684    23.25821
            18 2  |     22.186   .2405445    92.23   0.000     21.71454    22.65746
            18 3  |   22.87647   .1964405   116.45   0.000     22.49145    23.26148
            25 1  |   24.11163   .6839281    35.25   0.000     22.77116    25.45211
            25 2  |   24.62623   .2947946    83.54   0.000     24.04844    25.20402
            25 3  |   25.44274   .2398035   106.10   0.000     24.97274    25.91275
    -------------------------------------------------------------------------------
    
    .
    . marginsplot
    
      Variables that uniquely identify margins: time AP2bgeno
    
    .
    . margins time, dydx(AP2bgeno )
    
    Conditional marginal effects                      Number of obs   =       1177
    
    Expression   : Linear prediction, fixed portion, predict()
    dy/dx w.r.t. : 1.AP2bgeno 2.AP2bgeno
    
    ------------------------------------------------------------------------------
                 |            Delta-method
                 |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
    1.AP2bgeno   |
            time |
             15  |  -.5558766   .5341458    -1.04   0.298    -1.602783    .4910299
             18  |  -.6939404   .5829251    -1.19   0.234    -1.836453    .4485718
             25  |  -1.331111   .7247506    -1.84   0.066    -2.751596    .0893742
    -------------+----------------------------------------------------------------
    2.AP2bgeno   |
            time |
             15  |  -.6856956   .2769243    -2.48   0.013    -1.228457   -.1429338
             18  |  -.6904639   .3105649    -2.22   0.026     -1.29916   -.0817679
             25  |  -.8165142   .3800126    -2.15   0.032    -1.561325   -.0717032
    ------------------------------------------------------------------------------
    Note: dy/dx for factor levels is the discrete change from the base level.
    
    .
    I have a question about how to interpret the model and the marginal effects (ME).

    Can I say: The linear mixed-effects regression model showed a significant difference in BMI between male subjects in AP2bgeno group 2 and AP2bgeno group 3 (p = 0.013). Compared to group 3, group 2 had significantly lower BMI at ages 15 years (ME 68.6 percentage points, p = 0.013), 18 years (ME 69.0 percentage points, p = 0.026) and 25 years (ME 81.7 percentage points, p = 0.032).

    Should I present the p value or CI with ME?


    I greatly appreciate all the help.

    Best regards,
    Urmeli

  • #2
    Your model is mis-specified and cannot be interpreted at all.

    The problem is that in the bottom level of the model you have implicitly treated time as discrete (because in interaction terms, all variables are assumed discrete unless explicitly prefixed with c.), but in the random slopes level you have treated it as continuous. So Stata is calculating a random slope for a variable that does not exist in your model. The results you have are not meaningful.

    I do not know whether your intent is to treat time as continuous or discrete. If you want to treat it as continuous then your command should be:

    Code:
    mixed BMI c.time##ib3.AP2bgeno if sex==1 || ID: time, reml cov(unstructured)
    If your intent is to have time as a discrete variable, then I'm afraid you will have to resort to using the old -xi- command, because the random slopes mechanism in -mixed- does not support factor variable notation. So it will be something like this.

    Code:
    xi i.time*i.AP2bgeno
    mixed BMI _I* if sex == 1 || ID: _Itime_*, reml cov(unstructured)
    And unfortunately, you will not be able to use -margins- after this and will have to calculate the marginal effects directly.


    • #3
      Dear Clyde!

      Thank you very much for your response!

      Considering my aim (to see whether BMI differs between genotypes and whether that effect changes over time), how should I treat time? I was under the impression that I should treat it as continuous.

      How does the interpretation change when time is treated as continuous versus discrete?


      I am truly grateful for Your advice!

      Best regards,
      Urmeli


      • #4
        If you treat time as a continuous variable, your model will be based on the assumption that, on average, BMI increases linearly over time, all else being equal. In your situation, with 3 time periods numbered 15, 18, and 25, that means you expect the change in average BMI from time 18 to 25 to be approximately 7/3 times the change in average BMI from time 15 to 18 (7 = 25 - 18 and 3 = 18 - 15). In this situation, the regression coefficient of c.time will represent the rate of change in BMI per unit of time (conditional on AP2bgeno being at its base category). Similarly, the marginal effects in the output of -margins- will show you the rates of change in BMI per unit of time in the various categories of AP2bgeno.
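
        As a rough check of that linearity assumption, the arithmetic can be done with -display-. This is only a sketch: it plugs in the time coefficients for the base genotype from the factor-variable output in #1 (2.244071 and 4.810346), which come from the model already flagged as mis-specified in #2, so treat the numbers as illustrative.

        ```stata
        * Ratio of changes implied by a linear trend: (25 - 18)/(18 - 15)
        display 7/3
        * Ratio of changes in the discrete-time estimates from #1 (genotype 3)
        display (4.810346 - 2.244071)/2.244071
        ```

        The first line gives about 2.33 and the second about 1.14, so in that output the gain from 18 to 25 falls well short of what a straight line through the 15-to-18 change would predict.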

        If you treat time as discrete, then you are imposing no constraints on how the change in BMI from one time to another relates to the change in BMI between any other pair of times. The change might be linear, or it could be V-shaped, or inverted V-shaped. In this case, because you can't use factor-variable notation for random slopes, you can't use -margins- after estimation and will have to calculate marginal effects using -lincom-. But the "marginal effects" given by the coefficients of the time indicator variables will instead be differences in average BMI associated with each time period, compared to the baseline time (time = 15).
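
        A minimal sketch of the -lincom- route, assuming the -xi- approach from #2. The dummy names _IAP2bgeno_2 and _ItimXAP2_18_2 are an assumption about what -xi- generates for these variables; check them with -describe- before relying on them. By default -xi- omits the lowest category, so the -char- line is there to make genotype 3 the reference, as in the original model.

        ```stata
        * Make genotype 3 the omitted (reference) category for -xi-
        char AP2bgeno[omit] 3
        xi i.time*i.AP2bgeno
        * Inspect the generated names; those used below are assumed, not verified
        describe _I*
        mixed BMI _I* if sex == 1 || ID: _Itime_*, reml cov(unstructured)
        * Genotype 2 vs. 3 at time 18 = genotype main effect + interaction term
        lincom _IAP2bgeno_2 + _ItimXAP2_18_2
        ```

        The same pattern (main effect plus the matching interaction dummy) would give the genotype difference at time 25; at time 15 the main effect alone is the difference.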


        • #5
          Dear Clyde!

          I am so grateful for your thorough answer. It has helped me very much. I tried to run some simple models (without interactions) with time as continuous. When looking at a simple model like this:

          Code:
          . mixed BMI c.time ib3.AP2bgeno if sex==1 || ID: time, reml cov(unstructured)
          
          Performing EM optimization:
          
          Performing gradient-based optimization:
          
          Iteration 0:   log restricted-likelihood = -2719.5264  
          Iteration 1:   log restricted-likelihood = -2719.0861  
          Iteration 2:   log restricted-likelihood = -2718.7041  
          Iteration 3:   log restricted-likelihood = -2718.7039  
          
          Computing standard errors:
          
          Mixed-effects REML regression                   Number of obs      =      1177
          Group variable: ID                              Number of groups   =       494
          
                                                          Obs per group: min =         1
                                                                         avg =       2.4
                                                                         max =         3
          
          
                                                          Wald chi2(3)       =   1240.24
          Log restricted-likelihood = -2718.7039          Prob > chi2        =    0.0000
          
          ------------------------------------------------------------------------------
                   BMI |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
          -------------+----------------------------------------------------------------
                  time |   .4590506   .0130846    35.08   0.000     .4334052    .4846959
                       |
              AP2bgeno |
                    1  |  -.5275102   .5167235    -1.02   0.307     -1.54027    .4852492
                    2  |  -.7154496   .2697477    -2.65   0.008    -1.244145   -.1867539
                       |
                 _cons |   14.02651   .2672339    52.49   0.000     13.50274    14.55028
          ------------------------------------------------------------------------------
          
          ------------------------------------------------------------------------------
            Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
          -----------------------------+------------------------------------------------
          ID: Unstructured             |
                             var(time) |   .0263723   .0057693      .0171766     .040491
                            var(_cons) |   7.639615   2.099446      4.458126    13.09153
                       cov(time,_cons) |  -.2382201   .1025091     -.4391343   -.0373059
          -----------------------------+------------------------------------------------
                         var(Residual) |   1.973555   .1608257      1.682226    2.315335
          ------------------------------------------------------------------------------
          LR test vs. linear regression:       chi2(3) =   673.55   Prob > chi2 = 0.0000
          
          Note: LR test is conservative and provided only for reference.
          
          .
          Should I say that:
          The rate of change in BMI is 0.46 (95% CI 0.43, 0.48) kg/m2 per year. BMI is 0.72 kg/m2 (95% CI -1.24, -0.19) smaller in genotype 2 compared to genotype 3??

          But I am still confused about the marginal effects. How can I calculate marginal effects for this simplified model? I cannot figure it out, because I am no longer using time as a discrete variable, so I don't have 3 time points.

          I cannot thank You enough!

          Best regards
          Urmeli


          • #6
            Should I say that:
            The rate of change in BMI is 0.46 (95% CI 0.43, 0.48) kg/m2 per year. BMI is 0.72 kg/m2 (95% CI -1.24, -0.19) smaller in genotype 2 compared to genotype 3??
            Those are correct interpretations of your results.

            But I am still confused about the marginal effects. How can I calculate marginal effects for this simplified model? I cannot figure it out, because I am no longer using time as a discrete variable, so I don't have 3 time points.
            This being a purely linear model, the marginal effects are equal to the coefficients. The marginal effect of time is an increase of 0.46 kg/m2 per unit of time (whatever your time unit is: month, year, decade...)
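
            For what it's worth, -margins- can be used to confirm this. A minimal sketch, assuming it is run immediately after the -mixed- command shown in #5:

            ```stata
            * Average marginal effects on the fixed-portion linear prediction
            margins, dydx(time AP2bgeno)
            ```

            In a model with no interactions or nonlinear terms, each dy/dx reported should match the corresponding coefficient in the -mixed- table.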



            • #7
              So, in other words, the rate of change in BMI per year is significantly smaller in genotype 2 compared to genotype 3 (ME 0.72 [95% CI -1.24, -0.19])?

              Or should the marginal effects (ME) then be presented as percentage points?


              • #8
                So, in other words, the rate of change in BMI per year is significantly smaller in genotype 2 compared to genotype 3 (ME 0.72 [95% CI -1.24, -0.19])?
                No! The BMI itself is "significantly" smaller in genotype 2 than it is in genotype 3. Your model does not incorporate any interaction between genotype and time, so in your model the rate of change of BMI over time is explicitly stipulated to be the same in all three genotypes. If you want a model in which the rate of change of BMI over time can differ among the three groups, then you need a different regression:

                Code:
                mixed BMI c.time##ib3.AP2bgeno if sex==1 || ID: time, reml cov(unstructured)
                margins AP2bgeno, dydx(time)
                This model, which includes a time # genotype interaction, will allow the rate of increase of BMI over time to differ among the genotypes. The mean rates of change of BMI over time in each genotype will appear in the output of the -margins- command. If you wish to test whether the differences among those rates are "significant," that will appear in the -mixed- output in the rows for the interaction terms themselves.
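
                A hedged sketch of how a group-specific slope and the joint interaction test could also be obtained by hand, assuming the interaction model above has just been fit. The coefficient names follow Stata's factor-variable notation for that command.

                ```stata
                * Slope of BMI over time in genotype 2 = base slope + interaction term
                lincom time + 2.AP2bgeno#c.time
                * Joint Wald test that the slopes are equal across all three genotypes
                testparm i.AP2bgeno#c.time
                ```

                The -lincom- result should agree with the genotype 2 row of -margins AP2bgeno, dydx(time)-.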


                • #9
                  Okay, finally got it! Thank you so much for your time, I really appreciate it!


                  • #10
                    Hi Clyde
                    Assuming the -mixed- output with interaction terms is the one in Urmeli Joost's first post (which is wrong): from the interaction terms and the -margins- output, at age 25 there was a reduction in BMI in Group 1 (-1.331111, p = 0.066) compared to baseline age. Similarly, in Group 2 the BMI reduction was -.8165142 (p = 0.032) compared to baseline.
                    How can I test whether the difference between the reductions in Group 1 and Group 2 at age 25 is statistically significant? Thank you.


                    • #11
                      Hi Clyde
                      I would appreciate your help with interpreting the -mixed- output to test for a statistically significant difference between Group 1 and Group 2 at age 25.


                      • #12
                        Add the -pwcompare- option to the -margins time, dydx(AP2bgeno)- command.
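
                        The Group 1 versus Group 2 difference at age 25 can also be written out by hand. A sketch, assuming the factor-variable model from #1 is the active estimation result (with the caveat from #2 that its specification is questionable):

                        ```stata
                        * (Group 1 vs. 3 at age 25) minus (Group 2 vs. 3 at age 25)
                        lincom (1.AP2bgeno + 25.time#1.AP2bgeno) - (2.AP2bgeno + 25.time#2.AP2bgeno)
                        * The same difference from the dy/dx values reported in #1
                        display -1.331111 - (-.8165142)
                        ```

                        The -display- line gives about -0.51; -lincom- adds the standard error, p-value and confidence interval for that difference.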

                        I notice you have "bumped" your question about an hour after you originally posted it. Please don't do that. This is not a help line. It is not manned by anyone 24 hours a day. As it happens, the hour at which you posted it is the middle of the night where I am. And there are days that I am not on Statalist at all, sometimes even for many days running. In fact, by mentioning me by name, you actually increased the probability that your response will be delayed. There are many Forum members who could have answered this question for you. But seeing it addressed to me, they may have passed it by.

                        More generally, it is not reasonable to expect a response to a post within one hour here. If you are lucky, that can happen. But it is not the norm, and should not be expected. If you have a post that draws no response within 24 hours, then it is likely going to remain unanswered. Bumping it will usually not help. What could help is re-writing the post so that it is clearer and more focused, providing example data and relevant code attempted if you did not do so originally, and adhering to the extremely helpful guidance offered in the Forum FAQ.


                        • #13
                          Hi Clyde
                          My apologies. Thank you so much for your answer.
