  • Mixed with irregular follow-up timepoints

    Dear Listers,

    I have data collected at 7 time points: baseline (week 0) and weeks 3, 6, 9, 12, 18, and 24; see the data example below.

    At each time point, participants are asked to rate their wellbeing, and I am interested in assessing how it changes over time. For this I plan to use a mixed model regressing wellbeing scores on time. As participants are initially asked to rate their wellbeing every 3 weeks and then every 6 weeks, I am unsure how best to model time.

    I was originally planning to model time as a categorical variable, but I am not sure how to best reflect the change in the follow-up schedule. Can I still set up the model as

    Code:
     mixed wellbeing i.time || id:
    Or does time need to be recoded somehow?


    Code:
    * Example
    clear
    input int id byte time double wellbeing
    1 0 2 
    1 3 2 
    1 6 6 
    1 9 3 
    1 12 2 
    1 18 4
    1 24 3
    2 0 7 
    2 3 3 
    2 6 3 
    2 9 . 
    2 12 2
    2 18 1
    2 24 2
    3 0 7 
    3 3 7 
    3 6 5 
    3 9 . 
    3 12 7 
    3 18 4
    3 24 5
    end

  • #2
    Laura:
    you basically have a panel with missing values as far as wellbeing is concerned.
    However, the main issue seems to rest on the fact that the other variables are still measured on the original weekly scale.
    Therefore, your original sample size will be reduced accordingly.
    Kind regards,
    Carlo
    (Stata 19.0)



    • #3
      Hi Carlo,

      Sorry, could you say a bit more? I am unsure if you are suggesting I should stick to the original weekly interval so that 15 and 21 weeks would be missing; is that correct? Could I alternatively focus only on baseline, 6, 12, 18, and 24 weeks?

      I also noticed that you mentioned panel data. I had read that for panel data -mixed- and -xtreg- are equivalent, so I opted for -mixed-, but I was wondering if you had a take on this.



      • #4
        Laura:
        1) yes, I meant you stick with the "original weekly interval so that 15 and 21 weeks would be missing";
        2) the equivalence you mention is correct as long as you compare -mixed- with -xtreg, mle- (a -re- ML estimator):
        Code:
        . use "https://www.stata-press.com/data/r17/pig.dta"
        (Longitudinal analysis of pig weights)
        
        . mixed weight i.week|| id:
        
        Performing EM optimization ...
        
        Performing gradient-based optimization:
        Iteration 0:   log likelihood = -1007.0675  
        Iteration 1:   log likelihood = -1007.0675  
        
        Computing standard errors ...
        
        Mixed-effects ML regression                     Number of obs     =        432
        Group variable: id                              Number of groups  =         48
                                                        Obs per group:
                                                                      min =          9
                                                                      avg =        9.0
                                                                      max =          9
                                                        Wald chi2(8)      =   26412.22
        Log likelihood = -1007.0675                     Prob > chi2       =     0.0000
        
        ------------------------------------------------------------------------------
              weight | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
        -------------+----------------------------------------------------------------
                week |
                  2  |   6.760417   .4187014    16.15   0.000     5.939777    7.581056
                  3  |   13.84375   .4187014    33.06   0.000     13.02311    14.66439
                  4  |     19.375   .4187014    46.27   0.000     18.55436    20.19564
                  5  |   25.13542   .4187014    60.03   0.000     24.31478    25.95606
                  6  |   31.42708   .4187014    75.06   0.000     30.60644    32.24772
                  7  |    37.4375   .4187014    89.41   0.000     36.61686    38.25814
                  8  |   44.28125   .4187014   105.76   0.000     43.46061    45.10189
                  9  |   50.19792   .4187014   119.89   0.000     49.37728    51.01856
                     |
               _cons |   25.02083   .6298893    39.72   0.000     23.78627    26.25539
        ------------------------------------------------------------------------------
        
        ------------------------------------------------------------------------------
          Random-effects parameters  |   Estimate   Std. err.     [95% conf. interval]
        -----------------------------+------------------------------------------------
        id: Identity                 |
                          var(_cons) |   14.83704    3.12421      9.819998     22.4173
        -----------------------------+------------------------------------------------
                       var(Residual) |   4.207462   .3036474      3.652498    4.846747
        ------------------------------------------------------------------------------
        LR test vs. linear model: chibar2(01) = 484.84        Prob >= chibar2 = 0.0000
        
        
        . xtset id week
        
        Panel variable: id (strongly balanced)
         Time variable: week, 1 to 9
                 Delta: 1 unit
        
        . xtreg weight i.week, mle
        
        Fitting constant-only model:
        Iteration 0:   log likelihood = -1827.2124
        Iteration 1:   log likelihood = -1827.2118
        
        Fitting full model:
        Iteration 0:   log likelihood = -1008.0493
        Iteration 1:   log likelihood = -1007.0894
        Iteration 2:   log likelihood = -1007.0675
        Iteration 3:   log likelihood = -1007.0675
        
        Random-effects ML regression                        Number of obs    =     432
        Group variable: id                                  Number of groups =      48
        
        Random effects u_i ~ Gaussian                       Obs per group:
                                                                         min =       9
                                                                         avg =     9.0
                                                                         max =       9
        
                                                            LR chi2(8)       = 1640.29
        Log likelihood = -1007.0675                         Prob > chi2      =  0.0000
        
        ------------------------------------------------------------------------------
              weight | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
        -------------+----------------------------------------------------------------
                week |
                  2  |   6.760417   .4187015    16.15   0.000     5.939777    7.581056
                  3  |   13.84375   .4187015    33.06   0.000     13.02311    14.66439
                  4  |     19.375   .4187015    46.27   0.000     18.55436    20.19564
                  5  |   25.13542   .4187015    60.03   0.000     24.31478    25.95606
                  6  |   31.42708   .4187015    75.06   0.000     30.60644    32.24772
                  7  |    37.4375   .4187015    89.41   0.000     36.61686    38.25814
                  8  |   44.28125   .4187015   105.76   0.000     43.46061    45.10189
                  9  |   50.19792   .4187015   119.89   0.000     49.37728    51.01856
                     |
               _cons |   25.02083    .629889    39.72   0.000     23.78627    26.25539
        -------------+----------------------------------------------------------------
            /sigma_u |   3.851886   .4055422                      3.133686    4.734688
            /sigma_e |    2.05121   .0740167                      1.911151    2.201533
                 rho |   .7790719    .038439                      .6968047    .8468207
        ------------------------------------------------------------------------------
        LR test of sigma_u=0: chibar2(01) = 484.84             Prob >= chibar2 = 0.000
        
        . xtreg weight i.week
        
        Random-effects GLS regression                   Number of obs     =        432
        Group variable: id                              Number of groups  =         48
        
        R-squared:                                      Obs per group:
             Within  = 0.0000                                         min =          9
             Between = 0.0000                                         avg =        9.0
             Overall = 0.9311                                         max =          9
        
                                                        Wald chi2(8)      =   25861.96
        corr(u_i, X) = 0 (assumed)                      Prob > chi2       =     0.0000
        
        ------------------------------------------------------------------------------
              weight | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
        -------------+----------------------------------------------------------------
                week |
                  2  |   6.760417   .4231323    15.98   0.000     5.931093    7.589741
                  3  |   13.84375   .4231323    32.72   0.000     13.01443    14.67307
                  4  |     19.375   .4231323    45.79   0.000     18.54568    20.20432
                  5  |   25.13542   .4231323    59.40   0.000     24.30609    25.96474
                  6  |   31.42708   .4231323    74.27   0.000     30.59776    32.25641
                  7  |    37.4375   .4231323    88.48   0.000     36.60818    38.26682
                  8  |   44.28125   .4231323   104.65   0.000     43.45193    45.11057
                  9  |   50.19792   .4231323   118.63   0.000     49.36859    51.02724
                     |
               _cons |   25.02083   .6365547    39.31   0.000     23.77321    26.26846
        -------------+----------------------------------------------------------------
             sigma_u |  3.8926478
             sigma_e |  2.0729165
                 rho |  .77907191   (fraction of variance due to u_i)
        ------------------------------------------------------------------------------
        
        .
        Kind regards,
        Carlo
        (Stata 19.0)



        • #5
          Great, thanks Carlo.

          Including weeks 15 and 21 ensures that the wellbeing change is measured over 3-week intervals; does the fact that they have no data affect the model at all? I'd welcome suggestions for readings if they would help.



          • #6
            Laura:
            those data are missing because the interval at which the individuals are measured has changed, for reasons that should be explained in the study protocol.
            Maybe you could change the way time is represented in your dataset and express it in terms of waves of data instead of weeks (measurement #1, #2, and so on), as sketched below.
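
            A minimal sketch of that recoding, assuming the variable names from the example data in #1:
            Code:
             * Recode the irregular week values into consecutive measurement waves
             egen byte wave = group(time), label

             * Same model as before, with wave as the categorical time variable
             mixed wellbeing i.wave || id: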
            Kind regards,
            Carlo
            (Stata 19.0)



            • #7
              Originally posted by Laura Myles
              . . . can I still set up the model as

              Code:
               mixed wellbeing i.time || id:
              Or does time need to be recoded somehow?
              You do not need to recode the time values.

              You can use orthogonal polynomial contrasts (help contrast) with its p. operator.

              marginsplot will also respect the spacing between observation time points.
              Code:
              mixed wellbeing i.time || id: , reml dfmethod(kroger) nolrtest nolog
              
              contrast p.time, effects nowald small
              
              margins time
              marginsplot



              • #8
                Thanks Joseph Coveney. I have a follow-up question. The contrasts show significant linear and quadratic terms: the linear term has a positive sign while the quadratic one has a negative sign. I took this to suggest that the wellbeing scores decrease over time following a quadratic shape? When I plot the data this pattern is not so clear. Can you advise?



                • #9
                  Not having seen the data (your microsnippet above doesn't show any trend at all), it's difficult to make an informed comment, but try something like the following to see whether the turning point lies within your range of timepoints.
                  Code:
                  mixed wellbeing c.time##c.time || id:
                  
                  local peak = -_b[c.time] / 2 /_b[c.time#c.time]
                  
                  margins , at(time = (0 3 6 9 12 18 24))
                  marginsplot , xline(`peak') // Do you see it?
                  
                  assert inrange(`peak', 0, 24)



                  • #10
                    Thanks Joseph Coveney - I ran your code and found that the peak value doesn't lie in the range of timepoints; the assert command returned false. Looking at previous posts, this suggests that I don't have a U-shaped curve but more curvilinear. In fact, when I plot my data I find a steeper increase in the initial surveys with wellbeing scores reaching a plateau in more recent ones.

                    Are the effect tables informative? I am unsure what the contrast (and its standard error) indicate when using orthogonal polynomial contrasts.



                    • #11
                      Originally posted by Laura Myles
                      . . . this suggests that I don't have a U-shaped curve but more curvilinear. In fact, when I plot my data I find a steeper increase in the initial surveys with wellbeing scores reaching a plateau . . .
                      Yeah, this is what I suspected: the wellbeing self-rating scale is finite on both ends, and you've reached either a floor or a ceiling effect. You might want to look into whether an unadorned linear mixed model is the best option if there is severe heteroscedasticity in the residuals between the ascending portion of the timecourse and the asymptotic portion. You can check whether adding a robust or cluster-robust standard error option to the regression model dramatically affects the inferential statistics that are important to the study's objective.

                      Are the effect tables informative? I am unsure what the contrast (and its standard error) indicate when using orthogonal polynomial contrasts.
                      Yes, they are informative and are entirely consistent with what you see in your marginsplot. The contrasts do not demand that your outcome follow a U-shaped curve (although that possibility is entertained); rather, they indicate that the response profile has a linear component and a nonlinear component, that is, it's "more curvilinear" and not strictly a linear functional relationship. It's as in polynomial regression, which is usually seen as a first-order approximation done in order to accommodate obvious nonlinearity in the response profile in circumstances where the exact functional form is unknown.
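
                      As a minimal sketch of that distinction (assuming the variable names from your original post), you could compare a strictly linear trend in time with one that adds a quadratic component:
                      Code:
                       * Linear-only trend in time
                       mixed wellbeing c.time || id: , nolog
                       estimates store lin

                       * Added quadratic component to capture the curvature
                       mixed wellbeing c.time##c.time || id: , nolog
                       estimates store quad

                       * Likelihood-ratio test of the quadratic term (both models fit by ML on the same observations)
                       lrtest lin quad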



                      • #12
                        Thanks again Joseph Coveney - I am unsure what you mean by an unadorned linear mixed model; what is it and how can I implement it?

                        I also find the residuals are not symmetrical, using -symplot-, so I was hoping you could recommend a non-parametric alternative. I have identified the command -emh-, which allows one to run an extended Mantel-Haenszel test, but I am then unable to adjust for baseline scores. I suppose I would need to include the baseline as a timepoint, but I was wondering if there is something else I could use.
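
                        For reference, the symmetry check I ran was roughly along these lines (a sketch using the model from my original post):
                        Code:
                         * Fit the mixed model from #1 and obtain the residuals
                         mixed wellbeing i.time || id:
                         predict double res, residuals

                         * Symmetry plot of the residuals
                         symplot res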



                        • #13
                          Originally posted by Laura Myles
                          . . . unadorned linear mixed model; what is it and how can I implement it?
                          You have implemented it. It's what you show in code delimiters in your original post. Sorry I wasn't clear. I wasn't recommending that you implement it, but rather that you challenge it as to whether it's adequate for your purpose.

                          One challenge is to determine whether the standard errors are dramatically affected when you use an alternative vce() option, e.g., vce(robust) or vce(cluster id); see the sketch below.
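
                          A minimal sketch of that comparison (variable names assumed from the original post):
                          Code:
                           * Model-based standard errors
                           mixed wellbeing i.time || id: , nolog
                           estimates store model_se

                           * Cluster-robust standard errors, clustered on participant
                           mixed wellbeing i.time || id: , nolog vce(cluster id)
                           estimates store cluster_se

                           * Side-by-side comparison of coefficients and standard errors
                           estimates table model_se cluster_se, b(%9.3f) se(%9.3f)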

                          I also find the residuals are not symmetrical, using -symplot-, so I was hoping you could recommend a non-parametric alternative.
                          I cannot, I'm sorry. I tend to avoid nonparametric alternatives (and I speak as the author of -emh-).

                          One alternative to the linear mixed model that you're already using would be a generalized linear mixed model, such as meologit or meoprobit. But the wellbeing self-rating scale seems to have at least seven levels, and so it's not certain that resorting to an ordered-categorical regression model would gain much. And interpreting its results would be more involved.
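
                          A minimal sketch of that kind of model (variable names assumed from the original post; wellbeing would need to be coded as ordered integer categories):
                          Code:
                           * Mixed-effects ordered logistic regression as an alternative
                           meologit wellbeing i.time || id: , nolog

                           * Or the ordered probit analogue
                           meoprobit wellbeing i.time || id: , nolog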



                          • #14
                            Thanks Joseph Coveney, this is really helpful.

                            I am afraid I have one more question. I had to impute some missing data on the covariates that I now also adjust for. It seems that -mi estimate- does not support -contrast-; is it possible to manually calculate the Wald test from the orthogonal polynomial contrasts after mi estimate?



                            • #15
                              Originally posted by Laura Myles
                              . . . is it possible to manually calculate the Wald test from the orthogonal polynomial contrasts after mi estimate?
                              Yes. See the example below.

                              The example uses the auto dataset and imputes the five missing values for the variable rep78, which is ordered categorical. It then uses the optional [spec] syntax for mi estimate (see its help file) in order to manually compute the four orthogonal polynomial contrasts.
                              Code:
                              version 18.0
                              
                              clear *
                              
                              // seedem
                              set seed 959097216
                              
                              quietly sysuse auto
                              
                              count if missing(rep78)
                              
                               // Culling predictors for multiple imputation of rep78's missing values
                              stepwise , pr(0.15): ologit rep78 i.foreign c.(price mpg headroom-gear_ratio)
                              local column_names : colnames r(table)
                              
                              local predictors
                              foreach predictor of local column_names {
                                  if substr("`predictor'", 1, 3) != "cut" local predictors `predictors' `predictor'
                              }
                              
                              // Multiple imputation
                              mi set wide
                              mi register imputed rep78
                              mi impute ologit rep78 `predictors', add(10)
                              
                              // Manual calculation of orthogonal polynomial contrasts (q.rep78)
                              mi estimate ///
                                  (Linear: -2 * _b[1b.rep78] - _b[2.rep78] + _b[4.rep78] + 2 * _b[5.rep78]) ///
                                  (Quadratic: 2 * _b[1b.rep78] - _b[2.rep78] - 2 * _b[3.rep78] - _b[4.rep78] + 2 * _b[5.rep78]) ///
                                  (Cubic: -_b[1b.rep78] + 2 * _b[2.rep78] - 2 * _b[4.rep78] + _b[5.rep78]) ///
                                  (Quartic: _b[1b.rep78] - 4 * _b[2.rep78] + 6 * _b[3.rep78] - 4 * _b[4.rep78] + _b[5.rep78]): ///
                                      regress price i.(foreign rep78) c.(weight length displacement)
                              
                              exit
                              Those contrasts are equivalent to contrast's q. syntax for equally spaced levels (which rep78's are). They use the conventional orthogonal polynomial contrast coefficients found in appendices of ANOVA textbooks. (Stata's contrast normalizes them; see the user's manual. So the effect, its standard error, and confidence bounds shown by contrast will be only proportional to these, but the Wald test statistics and p-values will be identical.)

                              For your case, where the levels of your time variable are not equally spaced, it will be easier to let Stata compute the coefficients for you. Run the code below in order to see how to retrieve them for use in manually computing orthogonal polynomial contrasts with mi estimate. (The top part is just to create a toy dataset for use in the illustration.)
                              Code:
                              version 18.0
                              
                              clear *
                              
                              // seedem
                              set seed 24170006
                              
                              quietly set obs 200
                              generate int pid = _n
                              generate double pid_u = rnormal()
                              
                              quietly expand 7
                              sort pid
                              egen byte tim = fill(0 3 6 9 12 18 24 0 3 6 9 12 18 24)
                              
                              generate double out = pid_u + rnormal()
                              
                              mixed out i.tim || pid: , nolrtest nolog
                              
                              contrast p.tim
                              
                              // Here
                              matrix list r(L)
                              
                              matrix define Coefficients =  r(L)[1..6, "out:0b.tim".."out:24.tim"]
                              matrix list Coefficients
                              
                              exit

