
  • mixed command: missing standard errors but no error message

    I am trying to model the intensity of a care intervention for an individual (measured in time) as a function of time that a person has been known to a service.

    The data has observations on 1766 individuals, with a mean of 4.3 observations per person, which are structured as follows.
    • id: Person ID
    • intensity: Dependent variable
    • t: Time of snapshot (taken at six monthly intervals)
    • months_open_60: Months individual has been open to service at time of snapshot (mean-centred, has negative values)
    • mo1, mo2, mo3: Orthogonal linear, quadratic and cubic polynomials for months_open_60.
    The head of the data is:

    Code:
         +----------------------------------------------------------------------+
         | id   intensity   t   month~60          mo1          mo2          mo3 |
         |----------------------------------------------------------------------|
      1. |  1   9.6994724   1         12    .45757686   -.72035168   -.84513099 |
      2. |  1   8.8730478   2         18    .62540256    -.5047519   -.94836488 |
      3. |  1   9.3810953   3         24    .79322826     -.231857   -.93265565 |
      4. |  1   9.0255758   4         30    .96105396    .09833303    -.7712714 |
      5. |  1   8.7406567   5         36    1.1288797    .48581818   -.43748023 |
         |----------------------------------------------------------------------|
      6. |  1   9.0777228   6         42    1.2967054    .93059846    .09544976 |
      7. |  1   7.1115121   7         48    1.4645311    1.4326739    .85425047 |
      8. |  1           .   8         54    1.6323568    1.9920444    1.8656538 |
      9. |  2    8.908965   1          6    .28975116   -.87865634   -.64968587 |
     10. |  2   8.6251503   2         12    .45757686   -.72035168   -.84513099 |
    Each id has between 1 and 8 values for intensity. All values for each individual are contiguous, reflecting the time during the 4 years (8 snapshots at six-month intervals) that they were known to the service. Each snapshot time has around 1,000 observed values.
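For readers unfamiliar with orthogonal polynomials: mo1–mo3 can be thought of as the raw linear, quadratic and cubic terms of months_open_60 made mutually uncorrelated, which reduces collinearity between the polynomial terms. A minimal Python sketch of the idea (the six-monthly grid and the Gram–Schmidt construction here are illustrative assumptions, not necessarily how the variables were actually generated):

```python
# Sketch: build orthogonal linear/quadratic/cubic terms from a
# hypothetical grid of centred months-open values via Gram-Schmidt.
months = list(range(-54, 54, 6))  # six-monthly centred values (assumed)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def centre(v):
    m = sum(v) / len(v)
    return [x - m for x in v]

def orthogonalise(v, basis):
    # Subtract the projection of v onto each already-built column.
    for b in basis:
        c = dot(v, b) / dot(b, b)
        v = [x - c * y for x, y in zip(v, b)]
    return v

basis = []
for k in (1, 2, 3):  # raw linear, quadratic, cubic powers
    basis.append(orthogonalise(centre([m ** k for m in months]), basis))

mo1, mo2, mo3 = basis
# The columns span the same cubic trend but are mutually uncorrelated:
print(abs(dot(mo1, mo2)) < 1e-6, abs(dot(mo1, mo3)) < 1e-6)
```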

    The value of t, the snapshot time, is arbitrary as far as the individual is concerned so I am using the months they have been open to the service as the predictor variable for time. This means that although the measurement occasions were fixed in time, the values of my time variable (mo1, mo2, mo3) vary across individuals at those times. For this reason, I want to run this as a mixed model, with random time effects to reflect unobserved individual factors, and add some fixed covariates related to demographics later.

    The variance of the intensity varies across time, and the correlation between observations is greater when observations are closer together in time, so I am using the covariance(unstructured) option. This works unproblematically for a fixed quadratic model with a random linear slope, i.e.

\[ y_{ij} = \beta_{0j} + \beta_{1j}\, mo1_{ij} + \beta_2\, mo2_{ij} + e_{ij} \\
\beta_{0j} = \beta_0 + u_{0j} \\
\beta_{1j} = \beta_1 + u_{1j} \]

    Or in Stata:

    Code:
    mixed intensity mo1 mo2 || id: mo1, mle variance covariance(unstructured)
    That produces this output:

    Code:
    Performing gradient-based optimization:
    Iteration 0:   log likelihood = -13511.573  
    Iteration 1:   log likelihood = -13507.184  
    Iteration 2:   log likelihood =  -13507.15  
    Iteration 3:   log likelihood =  -13507.15  
    
    Computing standard errors ...
    
    Mixed-effects ML regression                     Number of obs     =      7,600
    Group variable: id                              Number of groups  =      1,760
                                                    Obs per group:
                                                                  min =          1
                                                                  avg =        4.3
                                                                  max =          8
                                                    Wald chi2(2)      =      14.92
    Log likelihood =  -13507.15                     Prob > chi2       =     0.0006
    
    ------------------------------------------------------------------------------
       intensity | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
    -------------+----------------------------------------------------------------
             mo1 |  -.0823819   .0255335    -3.23   0.001    -.1324266   -.0323373
             mo2 |   .0267544   .0197757     1.35   0.176    -.0120051     .065514
           _cons |   7.256383   .0271316   267.45   0.000     7.203206     7.30956
    ------------------------------------------------------------------------------
    
    ------------------------------------------------------------------------------
      Random-effects parameters  |   Estimate   Std. err.     [95% conf. interval]
    -----------------------------+------------------------------------------------
    id: Unstructured             |
                        var(mo1) |   .2055391   .0388611      .1418922    .2977355
                      var(_cons) |   .6037068   .0448357      .5219268    .6983007
                  cov(mo1,_cons) |   .0787516   .0252527      .0292572     .128246
    -----------------------------+------------------------------------------------
                   var(Residual) |    1.58523   .0308387      1.525925     1.64684
    ------------------------------------------------------------------------------
    LR test vs. linear model: chi2(3) = 928.23                Prob > chi2 = 0.0000
However, a likelihood ratio test demonstrates that this model is not preferable to a linear model without the quadratic fixed effect. Intensity is not linear for individuals over time, so I suspect this is because a fixed quadratic effect cannot capture the large variation in intensity. I tried adding a full quadratic random effect, i.e.:

\[ y_{ij} = \beta_{0j} + \beta_{1j}\, mo1_{ij} + \beta_{2j}\, mo2_{ij} + e_{ij} \\
\beta_{0j} = \beta_0 + u_{0j} \\
\beta_{1j} = \beta_1 + u_{1j} \\
\beta_{2j} = \beta_2 + u_{2j} \]

    Or in Stata code:

    Code:
    mixed intensity mo1 mo2 || id: mo1 mo2, mle variance covariance(unstructured)
    However, I am confused by the output:

    Code:
    Performing gradient-based optimization:
    Iteration 0:   log likelihood = -13626.217  
    Iteration 1:   log likelihood = -13605.349  
    Iteration 2:   log likelihood = -13601.943  
    Iteration 3:   log likelihood = -13601.464  
    Iteration 4:   log likelihood = -13601.461  
    Iteration 5:   log likelihood = -13601.461  
    
    Computing standard errors ...
    
    Mixed-effects ML regression                     Number of obs     =      7,692
    Group variable: id                              Number of groups  =      1,765
                                                    Obs per group:
                                                                  min =          1
                                                                  avg =        4.4
                                                                  max =          8
                                                    Wald chi2(2)      =      11.98
    Log likelihood = -13601.461                     Prob > chi2       =     0.0025
    
    ------------------------------------------------------------------------------
       intensity | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
    -------------+----------------------------------------------------------------
             mo1 |  -.0789441   .0252904    -3.12   0.002    -.1285123   -.0293759
             mo2 |   .0151725   .0221992     0.68   0.494    -.0283372    .0586822
           _cons |   7.246859   .0271523   266.90   0.000     7.193641    7.300077
    ------------------------------------------------------------------------------
    
    ------------------------------------------------------------------------------
      Random-effects parameters  |   Estimate   Std. err.     [95% conf. interval]
    -----------------------------+------------------------------------------------
    id: Unstructured             |
                        var(mo1) |    .478652          .             .           .
                        var(mo2) |   .1620386          .             .           .
                      var(_cons) |   .3203482          .             .           .
                    cov(mo1,mo2) |   .0301161          .             .           .
                  cov(mo1,_cons) |     .06083          .             .           .
                  cov(mo2,_cons) |   -.219922          .             .           .
    -----------------------------+------------------------------------------------
                   var(Residual) |   1.445289          .             .           .
    ------------------------------------------------------------------------------
    LR test vs. linear model: chi2(6) = 1086.28               Prob > chi2 = 0.0000
The model converges after a few iterations, and I do not get an error saying that Stata is unable to compute standard errors for the random part of the model. However, the standard errors are missing.

I cannot use -lrtest- as the standard errors are missing, but compared with the previous model, twice the change in log-likelihood is 137.304 on 3 degrees of freedom, which would indicate that this model is preferable to the previous one.
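If helpful, the tail probability of a chi-squared statistic on 3 degrees of freedom has a closed form, so the hand calculation can be checked without tables. A Python sketch (the 137.304 figure is from the post above; note that a formal -lrtest- also requires both models to be fit on the same estimation sample, and that tests of variance components on the boundary are conservative, as Stata's own note says):

```python
from math import erfc, exp, pi, sqrt

def chi2_sf_3df(x):
    # P(chi-squared_3 > x): closed form for 3 degrees of freedom.
    return erfc(sqrt(x / 2)) + sqrt(2 * x / pi) * exp(-x / 2)

# Sanity check against the usual table value: P(chi2_3 > 7.815) ~ 0.05
print(round(chi2_sf_3df(7.815), 3))

# The hand-computed LR statistic from the post is far in the tail
print(chi2_sf_3df(137.304) < 1e-10)
```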

However, can I use this log-likelihood figure? My sense is that the missing standard errors mean I cannot use the model - that it has somehow failed - but I do not really understand why.

I am pretty sure that this is happening because of the use of months open as the time variable: if I replace it with t, the snapshot period, then I do not have this issue. I have already removed the months-open values with very few observations:

    Code:
    . tab months_open60
    
    months_open |
             60 |      Freq.     Percent        Cum.
    ------------+-----------------------------------
            -54 |        481        6.49        6.49
            -48 |        465        6.27       12.76
            -42 |        436        5.88       18.64
            -36 |        402        5.42       24.06
            -30 |        384        5.18       29.23
            -24 |        360        4.85       34.09
            -18 |        365        4.92       39.01
            -12 |        355        4.79       43.80
             -6 |        357        4.81       48.61
              0 |        349        4.71       53.32
              6 |        399        5.38       58.70
             12 |        467        6.30       64.99
             18 |        545        7.35       72.34
             24 |        507        6.84       79.18
             30 |        462        6.23       85.41
             36 |        416        5.61       91.02
             42 |        362        4.88       95.90
             48 |        304        4.10      100.00
    ------------+-----------------------------------
          Total |      7,416      100.00
    But I still get the same issue. What is going on here? What do the missing standard errors actually mean? Is my diagnosis on the right lines? Is there a way of keeping a meaningful time variable and avoiding this issue? I'd be very grateful if anyone could shed any light on this.
    Last edited by Sam Rickman; 16 May 2022, 08:58.

  • #2
On a hunch, I think there's something up with your Hessian matrix, which might be hinted at in the full output following each -mixed- command, if you show it. The final Hessian is probably not positive definite. Some things you may consider:
    • simplify the model by removing unnecessary random effects, or alternatively, simplify the covariance structure (such as independent);
    • use REML instead of MLE and compare the final models using the log-likelihood (or relatedly, AIC/BIC-type measures) as long as the fixed effects components remain the same;
    • perhaps you might find a solution by augmenting the maximization procedure by using the -difficult- option (this is no guarantee, but sometimes is helpful).
This is discussed somewhat (in the SAS context) here; see also this paper.



    • #3
      Thank you very much for this response, Leonardo. I have posted the full output from both commands below, if that sheds any light on it.

The blog post you linked to suggests that problems can occur when a random effect is unnecessary, i.e. when there is actually very little individual variance. I don't know whether it's reasonable to run that inference in reverse: if there are problems, is that because there's no need for the random effect?

      • simplify the model by removing unnecessary random effects, or alternatively, simplify the covariance structure (such as independent);
Yes, I want to remove unnecessary random effects, but I don't know whether I can conclude from this output that this random effect is unnecessary. I can also run -estat ic- afterwards, which I don't think I would be able to do if the model had failed with an error. The BIC and AIC are higher than in the simpler model, so if I can trust that output (?), then I can conclude that the second random effect is unnecessary.
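For reference, the information criteria reported by -estat ic- are simple functions of the log likelihood, the number of estimated parameters, and the sample size. A sketch of the standard formulas (the inputs below are made up purely to show the mechanics; the parameter count k = 7 is a hypothetical, not taken from the models above):

```python
from math import log

def aic(ll, k):
    # Akaike information criterion: 2k - 2*log-likelihood (lower is better)
    return 2 * k - 2 * ll

def bic(ll, k, n):
    # Bayesian information criterion: k*ln(n) - 2*log-likelihood
    return k * log(n) - 2 * ll

# Hypothetical log likelihood, parameter count, and sample size
print(round(aic(-13507.15, 7), 2), round(bic(-13507.15, 7, 7600), 2))
```

BIC penalises each extra parameter by ln(n), so with thousands of observations it favours simpler models more strongly than AIC does.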

      • use REML instead of MLE and compare the final models using the log-likelihood (or relatedly, AIC/BIC-type measures) as long as the fixed effects components remain the same;
Thanks, good thought, but I'm afraid it has the exact same issue.

      • perhaps you might find a solution by augmenting the maximization procedure by using the -difficult- option (this is no guarantee, but sometimes is helpful).
Again thanks, but it did not change the situation.

      I include the full output below.

      Code:
      . mixed intensity mo1 mo2 || id: mo1, mle variance covariance(unstructured)
      
      Performing EM optimization ...
      
      Performing gradient-based optimization: 
      Iteration 0:   log likelihood = -13511.573  
      Iteration 1:   log likelihood = -13507.184  
      Iteration 2:   log likelihood =  -13507.15  
      Iteration 3:   log likelihood =  -13507.15  
      
      Computing standard errors ...
      
      Mixed-effects ML regression                     Number of obs     =      7,600
      Group variable: id                              Number of groups  =      1,760
                                                      Obs per group:
                                                                    min =          1
                                                                    avg =        4.3
                                                                    max =          8
                                                      Wald chi2(2)      =      14.92
      Log likelihood =  -13507.15                     Prob > chi2       =     0.0006
      
      ------------------------------------------------------------------------------
         intensity | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
      -------------+----------------------------------------------------------------
               mo1 |  -.0823819   .0255335    -3.23   0.001    -.1324266   -.0323373
               mo2 |   .0267544   .0197757     1.35   0.176    -.0120051     .065514
             _cons |   7.256383   .0271316   267.45   0.000     7.203206     7.30956
      ------------------------------------------------------------------------------
      
      ------------------------------------------------------------------------------
        Random-effects parameters  |   Estimate   Std. err.     [95% conf. interval]
      -----------------------------+------------------------------------------------
      id: Unstructured             |
                          var(mo1) |   .2055391   .0388611      .1418922    .2977355
                        var(_cons) |   .6037068   .0448357      .5219268    .6983007
                    cov(mo1,_cons) |   .0787516   .0252527      .0292572     .128246
      -----------------------------+------------------------------------------------
                     var(Residual) |    1.58523   .0308387      1.525925     1.64684
      ------------------------------------------------------------------------------
      LR test vs. linear model: chi2(3) = 928.23                Prob > chi2 = 0.0000
      
      Note: LR test is conservative and provided only for reference.
      
      
      
      
      
      . mixed intensity mo1 mo2 || id: mo1 mo2, mle variance covariance(unstructured) difficult
      
      Performing EM optimization ...
      
      Performing gradient-based optimization: 
      Iteration 0:   log likelihood = -13465.126  
      Iteration 1:   log likelihood = -13444.909  
      Iteration 2:   log likelihood = -13441.834  
      Iteration 3:   log likelihood = -13441.324  
      Iteration 4:   log likelihood = -13441.317  
      Iteration 5:   log likelihood = -13441.317  
      
      Computing standard errors ...
      
      Mixed-effects ML regression                     Number of obs     =      7,600
      Group variable: id                              Number of groups  =      1,760
                                                      Obs per group:
                                                                    min =          1
                                                                    avg =        4.3
                                                                    max =          8
                                                      Wald chi2(2)      =      11.30
      Log likelihood = -13441.317                     Prob > chi2       =     0.0035
      
      ------------------------------------------------------------------------------
         intensity | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
      -------------+----------------------------------------------------------------
               mo1 |  -.0711235   .0257854    -2.76   0.006     -.121662   -.0205849
               mo2 |   .0243523   .0229367     1.06   0.288    -.0206027    .0693073
             _cons |   7.250146   .0272169   266.38   0.000     7.196801     7.30349
      ------------------------------------------------------------------------------
      
      ------------------------------------------------------------------------------
        Random-effects parameters  |   Estimate   Std. err.     [95% conf. interval]
      -----------------------------+------------------------------------------------
      id: Unstructured             |
                          var(mo1) |   .4925663          .             .           .
                          var(mo2) |   .1719592          .             .           .
                        var(_cons) |   .3084749          .             .           .
                      cov(mo1,mo2) |   .0409253          .             .           .
                    cov(mo1,_cons) |   .0596388          .             .           .
                    cov(mo2,_cons) |  -.2203869          .             .           .
      -----------------------------+------------------------------------------------
                     var(Residual) |   1.445689          .             .           .
      ------------------------------------------------------------------------------
      LR test vs. linear model: chi2(6) = 1059.90               Prob > chi2 = 0.0000
      
      Note: LR test is conservative and provided only for reference.
      I am very much leaning towards removing the random effect on the squared term. I would just feel more confident in doing so if either a) I could say with certainty that the model had not converged, or b) I could say with certainty that the AIC and BIC produced after running the model were reliable (as they compare unfavourably with the simpler model). Do you feel from the output posted that I can say either?



      • #4
        I’m not really sure what’s best in this case. You mention these are orthogonal polynomials, so you may consider an independent covariance structure as a simplifying strategy.

Another, more common, approach would be to use the untransformed month variable, considering only up to a random quadratic term. You could plot fitted vs observed values for selected individuals to see how adequately (or not) the model fits.



        • #5
Originally posted by Sam Rickman
          The variance of the intensity varies across time, and the correlation between observations is greater when observations are closer together in time, so I am using the covariance(unstructured) option.
          Temporal autocorrelation of the observations' errors is modeled in the covariance structure of the residuals, and not as covariance of intercept and slope coefficients of your higher-level random effects.

          So, try something like this instead.*
          Code:
          mixed intensity c.(mo1 mo2) || id: mo1 mo2, covariance(independent) ///
              residuals(unstructured, t(t))
          I'd guess from the particulars of your survey that autocorrelation won't be great enough to affect the regression coefficients appreciably, and can safely be ignored, especially if you find it too time-consuming to fit a model that includes an unstructured residual covariance matrix with 36 unique elements.

* The code reflects my agreement with Leonardo's advice to switch to a simpler covariance structure for your person-level random effects.
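(The 36 figure is simply the number of unique elements in a symmetric T x T matrix with T = 8 snapshot occasions; a quick check:)

```python
T = 8  # snapshot occasions in the panel
# T variances on the diagonal plus T*(T-1)/2 distinct covariances
unique_elements = T * (T + 1) // 2
print(unique_elements)  # 36
```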



          • #6

Thanks a lot for these responses. I was misunderstanding something: there are two covariance matrices of interest. The first is the covariance matrix of each time point with every other time point. Joseph, I think this is what you mean when you talk about the unstructured residual matrix with 36 unique elements. For the entirety of the data it looks like:

            \[ \begin{bmatrix}
            1.71368 & & & & & & & \\
            0.756977 & 1.57591 & & & & & & \\
            0.529473 & 0.786411 & 2.08232 & & & & & \\
            0.491731 & 0.558595 & 0.933849 & 2.13161 & & & & \\
            0.276758 & 0.342848 & 0.638271 & 0.821977 & 1.58599 & & & \\
            0.170209 & 0.260075 & 0.510564 & 0.435605 & 0.899372 & 1.92696 & & \\
            0.205027 & 0.248522 & 0.631041 & 0.556762 & 0.742223 & 1.02729 & 2.1866 & \\
            0.060675 & 0.164207 & 0.300921 & 0.441473 & 0.58498 & 0.639033 & 1.20417 & 2.22011
            \end{bmatrix} \]

            And then there is the covariance matrix of the random effects to each other, which with a full quadratic random effects model is:

            \[ \begin{bmatrix}
            \sigma^2_{u0} \\
            \sigma_{u0,u1} & \sigma^2_{u1} \\
            \sigma_{u0,u2} & \sigma_{u1,u2} & \sigma^2_{u2}
            \end{bmatrix} \]

            I had thought that the -covariance()- option was specifying the former, but I realise after your helpful responses that it is specifying the latter.

            So if we specify independent for the random effects, that means that all the variances on the diagonal are independently estimated, and all the covariances are zero? My only concern is the second assumption. I am interested in the sign of covariance of the intercept and slope random effects, which I hope to interpret as saying something about unobserved individual time-invariant factors. Something like, people who have more intense intervention at the intercept (which is mean-centred, so the middle of their treatment), are likely to increase at a greater/lesser rate than others. Of course it becomes more complicated with more random effects added, especially if the signs differ.
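For reference, -covariance(independent)- constrains the random-effects covariance matrix above to be diagonal, so the intercept-slope covariances of interest are fixed at zero rather than estimated:

\[ \begin{bmatrix}
\sigma^2_{u0} & 0 & 0 \\
0 & \sigma^2_{u1} & 0 \\
0 & 0 & \sigma^2_{u2}
\end{bmatrix} \]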

I have tried your code. It does converge! Allowing all the variance into the errors seems to essentially remove the random effects around the intercept and slope (both are almost zero). I include the full output below. I think you are right to say that it is not necessary to do it this way, although Stata does not seem to agree. The model has 35 more parameters than when the residuals are not estimated separately (also pasted below), and twice the change in log-likelihood is 341.13. Consulting my chi-squared distribution table at 35 d.f. indicates the statistic needs to be > 49.802, so I would reject the null, but when I run -lrtest- Stata suggests I should accept it (also pasted below).

            Code:
            // Unstructured residuals by time
            . mixed intensity c.(mo1 mo2) || id: mo1 mo2, covariance(independent) ///
            >     residuals(unstructured, t(t))
            
            Obtaining starting values by EM ...
            
            Performing gradient-based optimization:
            Iteration 0:   log likelihood = -13500.874  (not concave)
            Iteration 1:   log likelihood = -13336.922  (not concave)
            Iteration 2:   log likelihood = -13321.947  
            Iteration 3:   log likelihood = -13313.636  (not concave)
            Iteration 4:   log likelihood =  -13313.09  (not concave)
            Iteration 5:   log likelihood = -13312.852  (not concave)
            Iteration 6:   log likelihood = -13312.771  (not concave)
            Iteration 7:   log likelihood = -13312.728  (not concave)
            Iteration 8:   log likelihood = -13312.648  (not concave)
            Iteration 9:   log likelihood = -13312.639  (not concave)
            Iteration 10:  log likelihood = -13312.631  (not concave)
            Iteration 11:  log likelihood = -13312.628  (not concave)
            Iteration 12:  log likelihood = -13312.627  (not concave)
            Iteration 13:  log likelihood = -13312.626  (not concave)
            Iteration 14:  log likelihood = -13312.626  (not concave)
            Iteration 15:  log likelihood = -13312.626  (not concave)
            Iteration 16:  log likelihood = -13312.625  (not concave)
            Iteration 17:  log likelihood = -13312.625  
            Iteration 18:  log likelihood = -13312.625  
            
            Computing standard errors ...
            
            Mixed-effects ML regression                     Number of obs     =      7,600
            Group variable: id                              Number of groups  =      1,760
                                                            Obs per group:
                                                                          min =          1
                                                                          avg =        4.3
                                                                          max =          8
                                                            Wald chi2(2)      =       6.39
            Log likelihood = -13312.625                     Prob > chi2       =     0.0409
            
            ------------------------------------------------------------------------------
               intensity | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
            -------------+----------------------------------------------------------------
                     mo1 |  -.0482098   .0249902    -1.93   0.054    -.0971897    .0007701
                     mo2 |   .0233993   .0223119     1.05   0.294    -.0203313    .0671299
                   _cons |   7.276759   .0259349   280.58   0.000     7.225928    7.327591
            ------------------------------------------------------------------------------
            
            ------------------------------------------------------------------------------
              Random-effects parameters  |   Estimate   Std. err.     [95% conf. interval]
            -----------------------------+------------------------------------------------
            id: Independent              |
                                var(mo1) |   3.15e-14   1.47e-11             0           .
                                var(mo2) |   8.34e-14   3.69e-11             0           .
                              var(_cons) |   .6191353   2.682196      .0001271    3015.297
            -----------------------------+------------------------------------------------
            Residual: Unstructured       |
                                 var(e1) |   1.618817    2.68463      .0627442    41.76591
                                 var(e2) |   1.508003   2.683596      .0460925    49.33716
                                 var(e3) |   1.821116    2.68383      .1013672     32.7173
                                 var(e4) |   1.749616   2.683767      .0865505    35.36844
                                 var(e5) |    1.63688   2.683493      .0658508     40.6886
                                 var(e6) |   1.494333   2.683446       .044249    50.46508
                                 var(e7) |   2.153964   2.684761      .1871876    24.78562
                                 var(e8) |   2.009101   2.685673      .1462699    27.59615
                              cov(e1,e2) |   .4017978   2.683458     -4.857683    5.661278
                              cov(e1,e3) |  -.0467038   2.683399     -5.306069    5.212662
                              cov(e1,e4) |  -.2737609   2.683652     -5.533622      4.9861
                              cov(e1,e5) |   -.419534    2.68382     -5.679725    4.840657
                              cov(e1,e6) |   -.465271   2.683899     -5.725616    4.795074
                              cov(e1,e7) |  -.3552162   2.684599     -5.616934    4.906502
                              cov(e1,e8) |  -.5917404   2.687019     -5.858201     4.67472
                              cov(e2,e3) |   .3294486   2.682931     -4.928999    5.587897
                              cov(e2,e4) |  -.1137927   2.682955     -5.372288    5.144702
                              cov(e2,e5) |  -.3648367   2.682967     -5.623356    4.893683
                              cov(e2,e6) |  -.4468693   2.682977     -5.705408     4.81167
                              cov(e2,e7) |  -.4079178   2.683517     -5.667515    4.851679
                              cov(e2,e8) |  -.4432057   2.684428     -5.704587    4.818176
                              cov(e3,e4) |   .3889085   2.683094     -4.869858    5.647675
                              cov(e3,e5) |  -.0422437   2.683079     -5.300982    5.216495
                              cov(e3,e6) |  -.0899681   2.683119     -5.348785    5.168848
                              cov(e3,e7) |   .0716543    2.68372      -5.18834    5.331648
                              cov(e3,e8) |    -.27024   2.684716     -5.532187    4.991707
                              cov(e4,e5) |   .3391258   2.682943     -4.919347    5.597598
                              cov(e4,e6) |  -.0345595   2.682964     -5.293071    5.223952
                              cov(e4,e7) |   .0086894   2.683439     -5.250755    5.268134
                              cov(e4,e8) |  -.2450761   2.683923     -5.505469    5.015317
                              cov(e5,e6) |   .4183339   2.682944     -4.840139    5.676807
                              cov(e5,e7) |   .2297353   2.683348     -5.029531    5.489001
                              cov(e5,e8) |   .0700127   2.683799     -5.190136    5.330162
                              cov(e6,e7) |   .5249811   2.683137     -4.733871    5.783833
                              cov(e6,e8) |   .2187205    2.68339     -5.040627    5.478068
                              cov(e7,e8) |   .7795099   2.684219     -4.481463    6.040483
            ------------------------------------------------------------------------------
            LR test vs. linear model: chi2(38) = 1317.28              Prob > chi2 = 0.0000
            
            Note: LR test is conservative and provided only for reference.
            
            // Independent covariance matrix of random effects without unstructured residuals by time
            
            . mixed intensity c.(mo1 mo2) || id: mo1 mo2, covariance(independent)
            
            Performing EM optimization ...
            
            Performing gradient-based optimization:
            Iteration 0:   log likelihood = -13500.874  
            Iteration 1:   log likelihood =  -13483.88  
            Iteration 2:   log likelihood = -13483.194  
            Iteration 3:   log likelihood = -13483.192  
            Iteration 4:   log likelihood = -13483.192  
            
            Computing standard errors ...
            
            Mixed-effects ML regression                     Number of obs     =      7,600
            Group variable: id                              Number of groups  =      1,760
                                                            Obs per group:
                                                                          min =          1
                                                                          avg =        4.3
                                                                          max =          8
                                                            Wald chi2(2)      =      11.58
            Log likelihood = -13483.192                     Prob > chi2       =     0.0031
            
            ------------------------------------------------------------------------------
               intensity | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
            -------------+----------------------------------------------------------------
                     mo1 |    -.07209   .0253706    -2.84   0.004    -.1218154   -.0223645
                     mo2 |   .0305336   .0222943     1.37   0.171    -.0131625    .0742296
                   _cons |   7.257848   .0265951   272.90   0.000     7.205722    7.309973
            ------------------------------------------------------------------------------
            
            ------------------------------------------------------------------------------
              Random-effects parameters  |   Estimate   Std. err.     [95% conf. interval]
            -----------------------------+------------------------------------------------
            id: Independent              |
                                var(mo1) |   .1078832   .0435522      .0489019    .2380028
                                var(mo2) |   .1764656   .0276881      .1297488     .240003
                              var(_cons) |    .594803   .0491476      .5058709    .6993694
            -----------------------------+------------------------------------------------
                           var(Residual) |   1.504296    .031062      1.444631    1.566425
            ------------------------------------------------------------------------------
            
            // Likelihood-ratio test
            
            LR test vs. linear model: chi2(3) = 976.15                Prob > chi2 = 0.0000
            
            . lrtest josephcoveney simplejc
            
            Likelihood-ratio test
            Assumption: simplejc nested within josephcoveney
            
            LR chi2(35) = 341.13
            Prob > chi2 = 0.0000
            Leonardo, thanks for your suggestion about the untransformed month variable. I did try that but it did not change the outcome.

            Overall, because I am interested in the intercept/slope covariance, I think it might make sense to stick with an unstructured covariance matrix and reduce the number of random effects. Do you think this seems reasonable? Thanks very much for all your help so far.



            • #7
              I have just realised that I was comparing the wrong value in the likelihood-ratio test for the following models:

              Code:
              // Model 1: Unstructured covariance matrix for e()
              mixed intensity c.(mo1 mo2) || id: mo1 mo2, covariance(independent) ///
                  residuals(unstructured, t(t))
              
              // Model 2: Single variance for e()
              mixed intensity c.(mo1 mo2) || id: mo1 mo2, covariance(independent)
I mistakenly thought the change in log likelihood was 35, but that is the difference in degrees of freedom between the two models. In fact the LR chi-squared statistic (twice the difference in log likelihoods) is 341.13, which means that I should prefer Model 1 over the simpler model, at least using that test.
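As a sanity check, the statistic and its p-value can be reproduced by hand from the log likelihoods reported above (a sketch, using Stata's -display- and the built-in -chi2tail()- function):

Code:
* Sketch: reproduce the LR statistic from the two reported log likelihoods
display 2 * (-13312.63 - (-13483.192))   // LR chi2, approximately 341.1
display chi2tail(35, 341.13)             // upper-tail p-value with 35 df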

              Comparing AIC would seem to agree that Model 1 is preferable:

              Code:
              // AIC for Model 1:
              Akaike's information criterion and Bayesian information criterion
              
              -----------------------------------------------------------------------------
                     Model |          N   ll(null)  ll(model)      df        AIC        BIC
              -------------+---------------------------------------------------------------
                          |      7,600          .  -13312.63      42   26709.25   27000.56
              -----------------------------------------------------------------------------
              
              // AIC for Model 2:
              
              Akaike's information criterion and Bayesian information criterion
              
              -----------------------------------------------------------------------------
                     Model |          N   ll(null)  ll(model)      df        AIC        BIC
              -------------+---------------------------------------------------------------
                         . |      7,600          .  -13483.19       7   26980.38   27028.94
              -----------------------------------------------------------------------------
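For reference, the AIC values in these tables follow directly from AIC = -2*ll + 2*df; a quick arithmetic sketch using the figures reported above:

Code:
* Sketch: AIC = -2*ll + 2*df, using the values in the tables above
display -2 * (-13312.63) + 2 * 42   // Model 1: approximately 26709.3
display -2 * (-13483.19) + 2 * 7    // Model 2: approximately 26980.4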
              The question remains whether I should prefer the complex model over a model like:

              Code:
              mixed intensity mo1 mo2 || id: mo1, mle variance covariance(unstructured)
I can't compare these with a likelihood-ratio test because they are non-nested, but again the more complex model has a substantially lower AIC:

              Code:
              // AIC for covariance(unstructured) :
              
              /*
              
              Akaike's information criterion and Bayesian information criterion
              
              -----------------------------------------------------------------------------
                     Model |          N   ll(null)  ll(model)      df        AIC        BIC
              -------------+---------------------------------------------------------------
                randomp1f2 |      7,600          .  -13507.15       7    27028.3   27076.85
              -----------------------------------------------------------------------------
              */

Overall, then, I think that, ugly as it is to estimate an additional 35 parameters, and although this makes interpretation more complex, allowing the residual matrix to be unstructured over time with -residuals(unstructured, t(t))- does seem to give the best model in terms of fit.



              • #8
I'm not sure why it didn't dawn on me yesterday, but with only two levels, as you have, you would typically leave the random-effects equation empty and include the time variable as categorical, that is, in the manner of a MANOVA. So, if you're going to fit an unstructured covariance matrix to the residuals, it would usually be something along these lines:
                Code:
                mixed intensity i.t || id: , noconstant ///
                    residuals(unstructured, t(t))
The fact that the estimates for your two random slopes collapsed to zero tips you off to this. (The random intercept will just be distributed among the diagonal elements.)

Originally posted by Sam Rickman
                I am interested in the sign of covariance of the intercept and slope random effects, which I hope to interpret as saying something about unobserved individual time-invariant factors. Something like, people who have more intense intervention at the intercept (which is mean-centred, so the middle of their treatment), are likely to increase at a greater/lesser rate than others.
                Unfortunately, modeling the residuals like that (with an empty random effects equation) isn't going to help you address your research question. For what you're interested in, perhaps you could simply model the time variable as continuous in both the fixed effects side and random effects side. Something like this:
                Code:
                mixed intensity c.t || id: t, covariance(unstructured)
                Also, if you are interested in the relationship between initial values and time slope, then perhaps you wouldn't want to center the time variable, because that will move the intercept from the beginning (where the initial differences existed) to the middle (where everyone is more similar as their trajectories cross) and might attenuate the (typically negative) covariance coefficient between intercept and time slope random effects.



                • #9
                  Thanks a lot for this, Joseph. I think you're right that the approach without a random slope does not really answer my research question about unobserved time-invariant individual factors.

Regarding your other suggestion, I think the issue I am having is that this would work with the time variable -t-, but because -t- just indexes the snapshot period, it is not very useful for interpretation. For example, suppose I found that the intercept/slope covariance at t=0 was positive; this would mean that people who had more intense care during the first snapshot period were likely to increase at a higher rate. However, t=0 could be any point in a person's treatment: people may have already been known to the service for years before the first snapshot.

So I want to use a time variable, months_open, which is the number of months a person has been open to the service. However, by the time of the first snapshot, most people are already known to the service. I do not want months_open = 0 as the intercept, because it is almost always outside the range of the observed data. So I have mean-centred it.
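For completeness, a minimal sketch of how that mean-centring might be done (the variable names here are assumptions based on the data description at the top of the thread):

Code:
* Sketch (assumed variable names): mean-centre months_open so the
* intercept falls at the average time open across the observations
summarize months_open, meanonly
generate months_open_60 = months_open - r(mean)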

                  So taking your code:

                  Code:
                  mixed intensity c.t || id: t, covariance(unstructured)
                  Edit: Apologies - I posted before I was finished.

If I take that code, I cannot run the same type of analysis using -mo1- as a categorical variable, because it is an orthogonal polynomial. If I updated it to use months_open_60, I still could not, because it has negative values. If I used months_open (not mean-centred), I could, but the intercept/slope covariance would then refer to a point outside the observed range of values, so it would not be that useful.

I guess my question is: do I want to stick with trying to quantify the unobserved, time-invariant individual factors and accept that I cannot have a quadratic random effect, and that the model might not fit as well?

                  Or should I go with a better fitting model, with unstructured residuals by time, which could give me something useful to say about the covariates but not about the unobserved factors?

If my reasoning is correct, this is not really a question you can answer but one I need to think about. However, if I have missed the point of your message, please let me know!
                  Last edited by Sam Rickman; 18 May 2022, 03:06. Reason: Edit: Apologies - I posted before I was finished.



                  • #10
Originally posted by Sam Rickman
                    However, t=0 could be any point in a person's treatment - people may have already been known to the service for years before the first snapshot.

So I want to use a time variable, months_open, which is how many months a person has been open to the service.
Perhaps you can have your cake and eat it, too: use t (uncentered) to examine the covariance between the intercept (within the observation interval) and slope random effects, and also include uncentered months_open (the duration a person has been open to the service as of entry into the study observation period) on the fixed-effects side, handling it as you would any other time-invariant personal characteristic present at study enrollment.

                    So, the model would be something like:
                    Code:
                    mixed intensity c.(t months_open) || id: t, covariance(unstructured)
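If you go that route, the estimated covariance matrix of the id-level random effects, and hence the intercept/slope covariance of interest, can be displayed after estimation; a minimal sketch using standard -mixed- postestimation:

Code:
* Sketch: after the -mixed- fit above, display the estimated
* covariance matrix of the random effects at the id level
estat recovariance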

