Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Visualizing results after running linear piece-wise spline modelling

    Hi All

    I'm analyzing the association between y (a clinical variable) and time (time since diagnosis of diabetes), adjusting for gender and at age at diagnosis. We already know from existing literature and the raw data, that y has a non-linear relationship with time since diagnosis. I've fit a multilevel piece-wise linear spline model to account for the non-linearity of the data with knots at 1, 2, 4, 8, 12, 17, 24 & 34 months after diagnosis (total follow-up time 72 months). The data is longitudinal with subjects having multiple measurements of Y, so essentially this is a growth curve analysis (random slope & random coefficient model):

    Code:
    mkspline durm1 1 durm2 2 durm4 4 durm8 8 durm12 12 durm17 17 durm24 24 durm34 34 durm36 = durationm
     
    mixed y durm1 durm2 durm4 durm8 durm12 durm17 durm24 durm34 durm36 sexnew diagagenew || id: durationm, cov(unstr) mle, if diabtype==1 & audityr>2010 & durationall<6.1
     
    Performing EM optimization:
     
    Performing gradient-based optimization:
     
    Iteration 0:   log likelihood = -739203.39 
    Iteration 1:   log likelihood = -739203.15 
    Iteration 2:   log likelihood = -739203.15 
     
    Computing standard errors:
     
    Mixed-effects ML regression                     Number of obs     =    187,201
    Group variable: id                              Number of groups  =     25,677
     
                                                    Obs per group:
                                                                  min =          1
                                                                  avg =        7.3
                                                                  max =         36
     
                                                    Wald chi2(11)     =   36063.34
    Log likelihood = -739203.15                     Prob > chi2       =     0.0000
     
    ------------------------------------------------------------------------------
                 y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
           durm1 |   -18.9578   .2976129   -63.70   0.000    -19.54111   -18.37449
           durm2 |  -11.29657   .3486496   -32.40   0.000    -11.97991   -10.61323
           durm4 |  -4.899221   .1451236   -33.76   0.000    -5.183658   -4.614784
           durm8 |    1.85443   .0570494    32.51   0.000     1.742615    1.966245
          durm12 |    .414913   .0476722     8.70   0.000     .3214771    .5083489
          durm17 |   .2890149   .0343775     8.41   0.000     .2216362    .3563936
          durm24 |   .2048707   .0219777     9.32   0.000     .1617952    .2479462
          durm34 |   .1576019   .0133097    11.84   0.000     .1315154    .1836883
          durm36 |   .0369482   .0055738     6.63   0.000     .0260238    .0478727
          sexnew |   1.399622   .1986482     7.05   0.000     1.010279    1.788965
      diagagenew |   .4907482   .0241491    20.32   0.000     .4434168    .5380795
           _cons |   91.32118   .4392206   207.92   0.000     90.46032    92.18204
    ------------------------------------------------------------------------------
     
    ------------------------------------------------------------------------------
      Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
    -----------------------------+------------------------------------------------
    id: Unstructured             |
                   var(durati~m) |   .1717863   .0028512      .1662881    .1774663
                      var(_cons) |   262.7687   3.711764      255.5935    270.1452
             cov(durati~m,_cons) |  -3.667118   .0833349     -3.830452   -3.503785
    -----------------------------+------------------------------------------------
                   var(Residual) |   99.06962   .3773554      98.33277    99.81199
    ------------------------------------------------------------------------------
    LR test vs. linear model: chi2(3) = 1.3e+05               Prob > chi2 = 0.0000
     
    Note: LR test is conservative and provided only for reference.

    I would now like to visualize the data post regression - plot y with time since diagnosis first for all subjects and then separately for males and females. What is the best way to do this?

    I've been told margins and marginsplot (which I generally use) doesn't work too well for linear spline models. Is this true? There is a user written package to help plot data post regression when using restricted cubic splines (the 'postrcspline' package).

    Is there a way to plot what I would like to see? Due to the nature of the data, it is stored on a secure online server and I'm unable to download and run any user-written commands which complicates matters.....

    Thanks!
    /Amal

  • #2
    Before you proceed, your model is mis-specified. You should not attempt to interpret or plot anything from these results.

    The problem is that you have asked Stata to fit random slopes on the variable durationm, but durationm does not appear among the fixed effects in the model. Consequently your random slopes are calculated subject to the constraint that the mean slope on durationm is zero, which I suspect is not even remotely realistic. I realize that durationm is indirectly represented in the fixed effects by the sline variables durm*. But that does not do the trick needed here.

    I think what you need to do is include random slopes on the durm* variables instead of on durationm.

    Note, however, that with 9 such random slopes, the use of cov(unstr) is going to force Stata to try to estimate a 9x9 covariance matrix. That's 81 extra parameters in your model. While your data set looks like it may be large enough to support that, it is going to take a very long time to do all that computing. Are you really interested in these 81 parameters? (I'm guessing you're not, given that you didn't even plan to include them in the model.) If not, I would simplify the model to just estimate cov(independent) or, if you have some interest in the approximate covariance among these random slopes, perhaps cov(exchangeable).

    Comment


    • #3
      Hi Clyde

      Thanks so much - What you say make total sense and the spline variables (durm1 to durm36) should be included in the random part of the model. It substantially increase the number parameters to be tested and with nine knots (plus other categorical variables that I will eventually add to model in addition to interactions) makes things quite complicated. So, I decided to take a step back and keep things simple for now. I've run two 'simpler' random intercept only models with few knots - 1. One model for the overall trajectory of y and 2. the same model but including interactions with gender (a likelihood ratio test shows the model with interactions has a significantly better fit. Girls have a higher y at at time 0 and marginally higher during follow-up):

      Code:
       
       mkspline durm2 2 durm8 8 durm16 16 durm28 28 durm38 = durationm eststo: mixed hba1cmol durm2 durm8 durm16 durm28 durm38 sexnew diagagenew || id:, cov(unstr) mle, if diabtype==1 & audityr>2010 & durationall<6.1 & ethnicnew!=.
      Note: single-variable random-effects specification in id equation; covariance structure set to identity
       
      Performing EM optimization:
       
      Performing gradient-based optimization:
       
      Iteration 0:   log likelihood = -712484.96 
      Iteration 1:   log likelihood = -712484.96 
       
      Computing standard errors:
       
      Mixed-effects ML regression                     Number of obs     =    179,066
      Group variable: id                              Number of groups  =     22,693
       
                                                      Obs per group:
                                                                    min =          1
                                                                    avg =        7.9
                                                                    max =         36
       
                                                      Wald chi2(7)      =   21093.83
      Log likelihood = -712484.96                     Prob > chi2       =     0.0000
       
      ------------------------------------------------------------------------------
          hba1cmol |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
             durm2 |  -16.11096   .1356142  -118.80   0.000    -16.37676   -15.84516
             durm8 |   .0841984   .0338622     2.49   0.013     .0178297    .1505672
            durm16 |   .5613997   .0200162    28.05   0.000     .5221687    .6006307
            durm28 |   .1208665   .0103057    11.73   0.000     .1006677    .1410654
            durm38 |    .039808   .0033696    11.81   0.000     .0332037    .0464123
            sexnew |   1.486144   .2000963     7.43   0.000     1.093963    1.878326
        diagagenew |   .5903004   .0248777    23.73   0.000      .541541    .6390598
             _cons |   89.12814   .4500975   198.02   0.000     88.24597    90.01032
      ------------------------------------------------------------------------------
       
      ------------------------------------------------------------------------------
        Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
      -----------------------------+------------------------------------------------
      id: Identity                 |
                        var(_cons) |   201.6421   2.159983      197.4528    205.9204
      -----------------------------+------------------------------------------------
                     var(Residual) |   122.4032   .4388346      121.5462    123.2664
      ------------------------------------------------------------------------------
      LR test vs. linear model: chibar2(01) = 1.1e+05       Prob >= chibar2 = 0.0000
      (est1 stored)
       
      . est store a
       
      . estat ic
       
      Akaike's information criterion and Bayesian information criterion
       
      -----------------------------------------------------------------------------
             Model |        Obs  ll(null)  ll(model)      df         AIC        BIC
      -------------+---------------------------------------------------------------
                 a |    179,066         .    -712485      10     1424990    1425091
      -----------------------------------------------------------------------------
                     Note: N=Obs used in calculating BIC; see [R] BIC note.
       
      . estat icc
       
      Residual intraclass correlation
       
      ------------------------------------------------------------------------------
                             Level |        ICC   Std. Err.     [95% Conf. Interval]
      -----------------------------+------------------------------------------------
      
                                id |   .6222651   .0026972      .6169643    .6275369
      
       
      . eststo: mixed hba1cmol c.durm2##i.sexnew c.durm8##i.sexnew c.durm16##i.sexnew c.durm28##i.sexnew c.durm38##i.sexnew diagagenew || id:, cov(unstr) mle, if diabtype==1 & audityr>2010 & durationall<6.1 & ethnicnew!=.
      Note: single-variable random-effects specification in id equation; covariance structure set to identity
       
      Performing EM optimization:
       
      Performing gradient-based optimization:
       
      Iteration 0:   log likelihood = -712453.89 
      Iteration 1:   log likelihood = -712453.89 
       
      Computing standard errors:
       
      Mixed-effects ML regression                     Number of obs     =    179,066
      Group variable: id                              Number of groups  =     22,693
       
                                                      Obs per group:
                                                                    min =          1
                                                                    avg =        7.9
                                                                    max =         36
       
                                                      Wald chi2(12)     =   21163.73
      Log likelihood = -712453.89                     Prob > chi2       =     0.0000
       
      ---------------------------------------------------------------------------------
             hba1cmol |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
      ----------------+----------------------------------------------------------------
                durm2 |  -15.71934   .1862358   -84.41   0.000    -16.08436   -15.35433
                      |
               sexnew |
              female  |   4.098497   .4847686     8.45   0.000     3.148368    5.048626
                      |
       sexnew#c.durm2 |
              female  |   -.827848   .2716672    -3.05   0.002    -1.360306   -.2953901
                      |
                durm8 |   .1450891   .0464176     3.13   0.002     .0541122    .2360659
                      |
       sexnew#c.durm8 |
              female  |   -.132021    .067851    -1.95   0.052    -.2650065    .0009645
                      |
               durm16 |   .6030136    .027526    21.91   0.000     .5490637    .6569635
                      |
      sexnew#c.durm16 |
              female  |   -.087893   .0400893    -2.19   0.028    -.1664667   -.0093194
                      |
               durm28 |   .1035275   .0141773     7.30   0.000     .0757405    .1313144
                      |
      sexnew#c.durm28 |
              female  |    .036824   .0206396     1.78   0.074    -.0036289    .0772768
                      |
               durm38 |   .0397364   .0046462     8.55   0.000     .0306301    .0488428
                      |
      sexnew#c.durm38 |
              female  |   .0001662   .0067207     0.02   0.980    -.0130061    .0133385
                      |
           diagagenew |   .5903152   .0248798    23.73   0.000     .5415516    .6390787
                _cons |   89.38644   .4054406   220.47   0.000     88.59179    90.18109
      ---------------------------------------------------------------------------------
       
      ------------------------------------------------------------------------------
        Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
      -----------------------------+------------------------------------------------
      id: Identity                 |
                        var(_cons) |   201.6354   2.159868      197.4463    205.9134
      -----------------------------+------------------------------------------------
                     var(Residual) |   122.3559   .4386657      121.4991    123.2187
      ------------------------------------------------------------------------------
      LR test vs. linear model: chibar2(01) = 1.1e+05       Prob >= chibar2 = 0.0000
      (est1 stored)
       
      . est store b
      . estat ic
       
      Akaike's information criterion and Bayesian information criterion
       
      -----------------------------------------------------------------------------
             Model |        Obs  ll(null)  ll(model)      df         AIC        BIC
      -------------+---------------------------------------------------------------
                 b |    179,066         .  -712453.9      15     1424938    1425089
      -----------------------------------------------------------------------------
                     Note: N=Obs used in calculating BIC; see [R] BIC note.
       
      . estat icc
       
      Residual intraclass correlation
       
      ------------------------------------------------------------------------------
                             Level |        ICC   Std. Err.     [95% Conf. Interval]
      -----------------------------+------------------------------------------------
                                id |   .6223482   .0026969       .617048    .6276194
      ------------------------------------------------------------------------------
       
      . lrtest a b
       
      Likelihood-ratio test                                 LR chi2(5)  =     62.14
      (Assumption: a nested in b)                           Prob > chi2 =    0.0000

      Do you have suggestions on the best way to plot the above? And whether margins are not suitable for linear spline models?

      Thanks!
      /Amal

      Comment


      • #4
        So, the challenge is that you need the values of durm* corresponding to interesting values of duration that you want to use along the horizontal axis. The easiest case is if the values you choose are values that actually occur in your data set. So, I'm going to assume that you actually have observations for which duration = 2, 8, 16, 28, and 38. Then you can capture the corresponding values of the spline variables as follows:

        Code:
        foreach d or numlist 2 8 16 28 38 {
            local at_`d'
            foreach v of varlist durm* {
                summ `v' if duration == `d', meanonly
                local at_`d' `at_`d'' `v' = `r(mean)'
            }
        }
        
        margins sexnew, at(`at_2') at(`at_8') at(`at_16') (`at_28') (`at_38') saving(margins_output, replace)
        Then you want to use the new margins_output.dta data set and generate a variable, duration, that corresponds to the corresponding sets of values of durm*. Then you can use regular graphing commands.

        Note: Not tested, there may be typos or other errors here, but this is the gist of it.

        I don't know of a simpler way to do this, though there may be one. If somebody else does, I hope he or she will chime in.

        Comment

        Working...
        X