  • Understanding interaction terms in mixed effects modelling for longitudinal change in an outcome - Growth curve analysis

    Hi

    I need some help in interpreting the coefficients for interaction terms in a mixed-effects model (longitudinal analysis) I've run to analyse change in my outcome over time (in months) given a set of predictors. I know this has been posted about before, but I'm still having difficulty in figuring out what's happening in my model! Here we have reason to believe that the main predictor - Dka - interacts with duration (time) in its effects on outcome.

    - Outcome is continuous
    - Duration is my time metric in months, and the model requires quadratic and cubic terms for duration. Duration runs from zero to 12 months
    - Main predictor of interest is dka with three categories (coded as 1, 2 and 3 with 3 set as the reference)
    - I know that the quadratic and cubic terms for duration should be entered differently in Stata, but for now I've entered it as below
    - A likelihood ratio test with and without the interaction terms is statistically significant, p>0.005


    The model run and its result:

    Code:
    mixed outcome durationm durationm2 durationm3 c.durationm##ib3.dka sex1 diagage i.ethnicn ib3.pdu || id: durationm, cov(unstr) mle var
    Code:
    Mixed-effects ML regression                     Number of obs      =      1288
    Group variable: id                              Number of groups   =       364
     
                                                    Obs per group: min =         1
                                                                   avg =       3.5
                                                                   max =         9
     
     
                                                    Wald chi2(14)      =    902.26
    Log likelihood = -5369.7496                     Prob > chi2        =    0.0000
     
    -----------------------------------------------------------------------------------
              outcome |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    ------------------+----------------------------------------------------------------
            durationm |   -20.2095   .8240737   -24.52   0.000    -21.82465   -18.59434
           durationm2 |   3.252352   .1656776    19.63   0.000      2.92763    3.577075
           durationm3 |  -.1467208   .0092647   -15.84   0.000    -.1648793   -.1285623
            durationm |          0  (omitted)
                      |
                 dka  |
         severe <7.1  |   10.03363   2.810954     3.57   0.000     4.524258      15.543
    moderate 7.1-7.3  |   6.121846    2.22295     2.75   0.006     1.764944    10.47875
                      |
     dka3#c.durationm |
         severe <7.1  |  -1.162192   .3961143    -2.93   0.003    -1.938562   -.3858225
    moderate 7.1-7.3  |  -.6533604   .3149256    -2.07   0.038    -1.270603   -.0361175
                      |
                 sex1 |  -.4225161   1.405661    -0.30   0.764    -3.177561    2.332528
              diagage |   .3344252   .1695414     1.97   0.049     .0021302    .6667203
                      |
           ethnicnew4 |
               mixed  |   6.124627   2.510784     2.44   0.015     1.203582    11.04567
               Black  |   4.876973   1.865152     2.61   0.009     1.221342    8.532604
         asian-other  |   6.035488   1.739225     3.47   0.001      2.62667    9.444306
                      |
                  pdu |
               PZ036  |   .3570729   1.846792     0.19   0.847    -3.262574     3.97672
               PZ058  |   5.234565   1.699423     3.08   0.002     1.903757    8.565374
                      |
                _cons |   85.66271   3.222403    26.58   0.000     79.34692    91.97851
    -----------------------------------------------------------------------------------
     
    ------------------------------------------------------------------------------
      Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
    -----------------------------+------------------------------------------------
    id: Unstructured             |
                   var(durati~m) |   2.890782   .5151161      2.038626    4.099144
                      var(_cons) |   166.5512   24.03746      125.5158    221.0025
             cov(durati~m,_cons) |  -11.85645   2.938093       -17.615   -6.097888
    -----------------------------+------------------------------------------------
                   var(Residual) |   141.8493   8.059522      126.9007    158.5588
    ----------------------------------------------------------------------
    The coefficients for dka are statistically significant. Those with severe and moderate dka have higher values of outcome compared to those with normal dka (ref group), adjusting for sex, age at diagnosis, ethnicity and clinic when duration=0 (or at diagnosis in other words). Results are in the expected direction and effect. How would I interpret the coefficients for the interaction terms between dka and duration (-1.16 and -0.65)?

    Thanks!

    /Amal

  • #2
    The model is mis-specified and you should not interpret it at all.

    It does not make sense to interact dka with just duration, when duration^2 and duration^3 are also in the model. You need to rerun this as:

    Code:
    mixed outcome ib3.dka##c.durationm##c.durationm##c.durationm // etc.
    Then, if you want to run a likelihood ratio test against the same model with just
    Code:
    mixed outcome ib3.dka c.durationm##c.durationm##c.durationm // etc.
    you can interpret that.

    It is really important to understand that when you include polynomial terms in a model, there is no meaning at all to any single one of those terms: they live and die together as an aggregate. Interpreting a single coefficient is essentially meaningless. Anything you do to one of them (e.g. interactions) you must do to all of them. [OK, this is a slight exaggeration: there are some very exotic circumstances where it could make sense to interact with just one of the degrees, but yours is not one of them. And they are sufficiently rare and esoteric that, for practical purposes, you should proceed as if there aren't any at all.]
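
    As a minimal sketch of that comparison (not the original poster's actual run): fit both models by maximum likelihood, store each, and let -lrtest- test all the dka-by-duration terms jointly. The variable names dka3 and ethnicnew4 are taken from the output table above; the stored-estimate names full and reduced are hypothetical, and the /// line continuations assume a do-file.

    ```stata
    * Full model: dka interacted with every term of the cubic in duration
    mixed outcome ib3.dka3##c.durationm##c.durationm##c.durationm ///
        sex1 diagage i.ethnicnew4 ib3.pdu || id: durationm, cov(unstr) mle
    estimates store full        // hypothetical name

    * Reduced model: same cubic in duration, dka as a main effect only
    mixed outcome ib3.dka3 c.durationm##c.durationm##c.durationm ///
        sex1 diagage i.ethnicnew4 ib3.pdu || id: durationm, cov(unstr) mle
    estimates store reduced     // hypothetical name

    * Joint test of all dka-by-duration terms at once
    lrtest reduced full
    ```

    With two non-reference dka levels and three polynomial terms, this test has 2 x 3 = 6 degrees of freedom. It treats the interaction as a single block, consistent with the point that the polynomial terms cannot be interpreted one at a time.
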
    Last edited by Clyde Schechter; 19 Apr 2017, 11:12.



    • #3
      Hi Clyde

      Thanks for your reply - that was very useful and noted (!) I ran the model again as you suggested, but with a slight change in order of the variables; dka after duration. Of course, I now get 6 coefficients for the three sets of interactions. I've also included the p-value for the LR test at the very end (now borderline statistically significant).

      In the model run without interaction terms, the coefficients for dka severe and moderate are much lower than below (4.4 and 3.0 respectively, compared to 13.7 & 5.9 below).

      Code:
      xtmixed outcome c.durationm##c.durationm##c.durationm##ib3.dka3 sex1 diagage i.ethnicnew4 ib3.pdu || id: durationm, cov(unstr) mle var
      Code:
      Performing EM optimization:
       
      Performing gradient-based optimization:
       
      Iteration 0:   log likelihood = -5368.4355 
      Iteration 1:   log likelihood = -5368.4194 
      Iteration 2:   log likelihood = -5368.4194 
       
      Computing standard errors:
       
      Mixed-effects ML regression                     Number of obs      =      1288
      Group variable: id                              Number of groups   =       364
       
                                                      Obs per group: min =         1
                                                                     avg =       3.5
                                                                     max =         9
       
       
                                                      Wald chi2(18)      =    907.97
      Log likelihood = -5368.4194                     Prob > chi2        =    0.0000
       
      ----------------------------------------------------------------------------------------------------------
                                       outcome |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
      -----------------------------------------+----------------------------------------------------------------
                                     durationm |  -19.89148   1.070369   -18.58   0.000    -21.98936   -17.79359
                                               |
                       c.durationm#c.durationm |   3.204204   .2200018    14.56   0.000     2.773008    3.635399
                                               |
           c.durationm#c.durationm#c.durationm |  -.1448496   .0124265   -11.66   0.000    -.1692052   -.1204941
                                               |
                                          dka3 |
                                  severe <7.1  |   13.73424   3.822537     3.59   0.000     6.242203    21.22627
                             moderate 7.1-7.3  |   5.903776   2.865402     2.06   0.039     .2876909    11.51986
                                               |
                              dka3#c.durationm |
                                  severe <7.1  |   -3.76887   2.406046    -1.57   0.117    -8.484633     .946893
                             moderate 7.1-7.3  |  -.5789495   1.863214    -0.31   0.756    -4.230781    3.072883
                                               |
                  dka3#c.durationm#c.durationm |
                                  severe <7.1  |   .3748766   .4807534     0.78   0.436    -.5673827    1.317136
                             moderate 7.1-7.3  |   .0030727   .3797409     0.01   0.994    -.7412057    .7473511
                                               |
      dka3#c.durationm#c.durationm#c.durationm |
                                  severe <7.1  |  -.0142254    .026613    -0.53   0.593     -.066386    .0379352
                             moderate 7.1-7.3  |  -.0008388   .0212101    -0.04   0.968    -.0424099    .0407323
                                               |
                                          sex1 |   -.421492   1.406737    -0.30   0.764    -3.178647    2.335663
                                       diagage |   .3291163   .1697083     1.94   0.052    -.0035058    .6617385
                                               |
                                    ethnicnew4 |
                                        mixed  |   6.105152   2.513132     2.43   0.015     1.179504     11.0308
                                        Black  |   4.854774    1.86698     2.60   0.009      1.19556    8.513987
                                  asian-other  |   5.983707   1.741036     3.44   0.001     2.571339    9.396075
                                               |
                                           pdu |
                                        PZ036  |   .3834102    1.84857     0.21   0.836    -3.239721    4.006542
                                        PZ058  |   5.229827   1.701028     3.07   0.002     1.895873    8.563781
                                               |
                                         _cons |   85.34188    3.29449    25.90   0.000     78.88479    91.79896
      ----------------------------------------------------------------------------------------------------------
       
      ------------------------------------------------------------------------------
        Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
      -----------------------------+------------------------------------------------
      id: Unstructured             |
                     var(durati~m) |   2.898162   .5165877      2.043614    4.110044
                        var(_cons) |   166.8424   24.06536      125.7561    221.3522
               cov(durati~m,_cons) |  -11.85966   2.941085     -17.62408   -6.095237
      -----------------------------+------------------------------------------------
                     var(Residual) |   141.2169   8.039535       126.307    157.8868
      ------------------------------------------------------------------------------
      LR test vs. linear regression:       chi2(3) =   246.82   Prob > chi2 = 0.0000
       
      Note: LR test is conservative and provided only for reference.
       
      . estimates store F
       
      . lrtest E F
       
      Likelihood-ratio test                                 LR chi2(6)  =     12.83
      (Assumption: E nested in F)                           Prob > chi2 =    0.0457
      Thanks!

      /Amal



      • #4
        "In the model run without interaction terms, the coefficients for dka severe and moderate are much lower than below (4.4 and 3.0 respectively, compared to 13.7 & 5.9 below)."

        In the model without interaction terms, the coefficients of the dka indicators are estimates of the dka effect on your outcome. In the model with interaction terms, they do not have that meaning: there they mean the effect of dka conditional on duration = 0. duration = 0 may or may not be an interesting or meaningful value of duration. But even if it is, clearly you cannot directly compare the coefficients across the two models: it's an apples vs oranges comparison.

        It's unfortunate that the likelihood ratio test for the interactions came out borderline, as it leaves one in a bit of a quandary about which model to use.

        One approach I like is to figure out how much of a contribution the interaction terms actually make compared to other things. Running graphs of predicted vs observed outcome values may give you some insight. Given the LR test result, I would lean towards including the interactions if the graphs show it to be even slightly better at predicting the outcome. But if there is really no discernible difference, I would feel comfortable going with the simpler non-interaction model despite the borderline statistical significance of the test.
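
        One way such a comparison might be set up, as a rough sketch only: it assumes the estimates stored earlier in the thread as E (no interactions) and F (with interactions) are still in memory, and that the outcome variable is named outcome as in the output header. The new variable names fit_E and fit_F and the graph names gE and gF are hypothetical.

        ```stata
        * E = model without interactions, F = model with them
        * (names taken from the -lrtest E F- call earlier in the thread)
        estimates restore F
        predict fit_F, xb          // linear prediction from the fixed portion
        estimates restore E
        predict fit_E, xb

        * Observed vs. predicted for each model; the line marks observed = predicted
        twoway (scatter outcome fit_F, msize(tiny)) (line fit_F fit_F, sort), ///
            legend(off) title("With interactions") name(gF, replace)
        twoway (scatter outcome fit_E, msize(tiny)) (line fit_E fit_E, sort), ///
            legend(off) title("Without interactions") name(gE, replace)
        graph combine gF gE
        ```

        Replacing xb with fitted would add the predicted random effects to each prediction. Which of the two is more informative here is a judgment call.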

        If you end up sticking with the interaction model, remember that you are committing yourself to saying that there is no such thing as "the effect of dka." Rather, you are saying that there are many effects of dka, which depend on duration, and you need to present multiple results either graphically or in tables of values of dka effect corresponding to an interesting set of values of duration. (The -margins- command is your friend here. If you are not familiar with it, I think the best introduction out there is Richard Williams' http://www.stata-journal.com/sjpdf.h...iclenum=st0260. The manual section is well written and has numerous worked examples as well, but it goes beyond the SJ article in both breadth and depth and is a heavier lift. The SJ article includes what you need to know for your immediate purposes. After you've mastered that, you can learn the rest from the manual chapter for future, more advanced, purposes.)
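
        A hedged sketch of what such a table of dka effects could look like, assuming the interaction model is the most recent estimation and that the installed Stata version accepts dydx() after it. The months 0, 3, 6, 9 and 12 are illustrative choices, not values taken from the thread.

        ```stata
        * Difference between each dka level and the reference (level 3)
        * at a few selected months of duration
        margins, dydx(dka3) at(durationm=(0 3 6 9 12))

        * Plot those differences against duration; the line at 0 marks "no difference"
        marginsplot, yline(0)
        ```

        Each row of the -margins- output is then one "effect of dka" at one duration, which is the form of presentation described above.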



        • #5
          Hi Clyde

          Thanks for the feedback. Just want to clarify: In the model without interaction terms, the coefficient terms for dka categories is the difference in outcome for each dka category compared to the reference dka group at time=0, (since duration is centered at zero). It should be same interpretation for all other covariates included in the model, including the variables for duration (i.e. a -19.9 reduction in outcome between 0 and 1 month). So, I'm not sure how the coefficients differ for dka alone in the model above with interaction terms (13.7 & 5.9) if these are also conditional on duration=0.

          Assuming I consider the model above with interaction terms as my final model, I conclude that dka interacts with duration in its effect on outcome and I want to visually present this. My main predictor of interest is dka, so I only want to show three lines for the three dka categories, adjusted for other predictors in the model above (age, sex, ethnicity & pdu). I use margins and marginsplot as below:

          Code:
           
           xtmixed outcome c.durationm##c.durationm##c.durationm##ib3.dka3 sex1 diagage i.ethnicnew4 ib3.pdu || id: durationm, cov(unstr) mle var
          (Model results above)
          Code:
          margins dka3, at(durationm=(0 1 2 3 4 5 6 7 8 9 10 11 12))
          marginsplot, noci
          I've attached the resulting graph... which shows the lines overlapping after 4 months indicating the interaction between dka and duration (otherwise the graph would have had three parallel lines had I rejected the interaction model and gone with the simpler model).

          If I understand correctly, the syntax for margins above calculates the adjusted margins for pH by dka, so it represents the model above? So this is one advantage of margins/marginsplot over, say, using the twoway function command to plot the coefficients directly from the model (as is common with growth curve modelling, but is restricted to certain subpopulations of the study pop).

          I tried to calculate MERs for the same model above, but got an error msg that option(dydx) is not allowed:

          Code:
          margins, dydx(dka3) at(durationm=(0 1 2 3 4 5 6 7 8 9 10 11 12))
          Thanks!

          /Amal



          • #6
            I forgot to add that
            Code:
             
             margins dka3, at(durationm=(0 1 2 3 4 5 6 7 8 9 10 11 12)) atmeans
            gives the same as the margins command I used above, which I guess is the default.
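
            A side note, offered as a sketch rather than a definitive account: by default -margins- averages predictions over the observed covariate values (asobserved), and atmeans is not that default. The two can still coincide here because the prediction is linear in sex1, diagage, ethnicnew4 and pdu, and none of these is interacted with dka3 or durationm, so averaging the predictions gives the same result as predicting at the averages. The numlist shorthand 0(1)12 is used only for brevity.

            ```stata
            * Default: average the linear prediction over the observed covariates
            margins dka3, at(durationm=(0(1)12))

            * Covariates held at their sample means instead
            margins dka3, at(durationm=(0(1)12)) atmeans
            ```

            In a nonlinear model, or if the other covariates were interacted with dka3 or durationm, the two commands would generally give different results.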

            /Amal



            • #7
              "Just want to clarify: In the model without interaction terms, the coefficient terms for dka categories is the difference in outcome for each dka category compared to the reference dka group at time=0, (since duration is centered at zero)."

              This is correct, but it leaves unstated the fact that, in the model without interaction terms, the coefficients of the dka categories are the difference in outcome for each dka category compared to the reference dka group at all times. Without interaction terms, in a linear model, the effect of one variable does not depend on anything: it is a constant, an unconditional effect. By omitting interaction terms, the model is forced to find a single overall effect value for each dka category that works out as best it can at all values of duration. When the reality is that the effect actually does vary with duration, the result that you get represents some sort of average of the different values of the effects at different times. The coefficients are the compromises that are "least bad fit" for the mis-specified model. What you get is a glove of the particular size that comes closest to fitting on your foot. Don't confuse it with a sock.

              "So, I'm not sure how the coefficients differ for dka alone in the model above with interaction terms (13.7 & 5.9) if these are also conditional on duration=0."

              By contrast, in the model with interaction terms, the effects of each level of dka are different for different values of duration. Because of the way things are represented in the model, you can get the effect of a dka level when duration = 0 just by reading off the coefficient of that dka level: but this coefficient no longer represents the effect of the dka level at other values of duration. It is strictly conditional on duration = 0.
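
              To make this concrete, here is a small worked sketch using the coefficients from the interaction model's output table. In that model the severe-vs-reference difference at month t is b0 + b1*t + b2*t^2 + b3*t^3, where b0 is the dka coefficient and b1 to b3 are the three interaction coefficients. The choice of t = 6 is illustrative, and the -lincom- line assumes that level 1 of dka3 is the severe category and that the interaction model is the active estimation.

              ```stata
              * Severe vs. reference at month t, from the rounded table coefficients
              scalar t = 6
              display 13.73424 - 3.76887*t + .3748766*t^2 - .0142254*t^3   // about 1.54

              * Same quantity with a standard error and confidence interval
              * (assumes level 1 of dka3 = severe)
              lincom _b[1.dka3] + 6*_b[1.dka3#c.durationm] ///
                  + 36*_b[1.dka3#c.durationm#c.durationm] ///
                  + 216*_b[1.dka3#c.durationm#c.durationm#c.durationm]
              ```

              At t = 0 the expression reduces to the dka coefficient itself (13.7); at t = 6 it is roughly 1.5. That is why the coefficient alone describes only the duration = 0 comparison.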

              "If I understand correctly, the syntax for margins above calculates the adjusted margins for pH by dka, so it represents the model above?"

              It represents the adjusted predicted values for outcome. I have a hard time imagining that the outcome is pH since in the graph it ranges from 60 to about 110, which is way outside of the physical range of pH.

              "I've attached the resulting graph... which shows the lines overlapping after 4 months indicating the interaction between dka and duration (otherwise the graph would have had three parallel lines had I rejected the interaction model and gone with the simpler model)."

              Correct.

              "So this is one advantage of margins/marginsplot over say using the twoway function command to plot the coefficients directly from the model (as is common with growth curve modelling, but is restricted to certain subpopulations of the study pop)."

              Yes.

              "I tried to calculate MERs for the same model above, but got an error msg that option(dydx) is not allowed:

              margins, dydx(dka3) at(durationm=(0 1 2 3 4 5 6 7 8 9 10 11 12))"

              I don't know why you are getting that. The syntax looks correct, and you should be able to do this. Are you perhaps using an earlier version of Stata? (Current is 14.2.) I do this sort of thing frequently and it's not a problem.

              Last edited by Clyde Schechter; 20 Apr 2017, 16:05.
