Linear mixed effect model to predict mean change of disease severity score per year using xtmixed command

Joseph Coveney

Join Date: Apr 2014

Posts: 4420
#16

15 Aug 2018, 20:11

Originally posted by Dana Rose Garfin

do you have any thoughts about which would be the most appropriate method?

Is there anything the matter with Stata's defaults? I suppose if you had only a handful of patients' data and wanted to use Kenward-Roger adjustments, then you'd have to use REML, but you have 600 patients' worth.
Hui SHI

Join Date: Nov 2021

Posts: 29
#17

03 Jan 2023, 20:32

Originally posted by Joseph Coveney

Clyde, could you elaborate on that? I don't quite follow what you're trying to say. Granted, the common regression coefficients and variance estimates might turn out to be similar or even nearly identical, but to me the models are different, with different likelihoods. One fits a covariance between the random slope and intercept, and the other doesn't.

Code:

version 15.1

clear *

set seed `=strreverse("1454205")'

quietly drawnorm intercept slope, double corr(1 -0.25 \ -0.25 1) n(250)
generate int pid = _n
quietly expand 4
bysort pid: generate double tim = _n - 2.5
generate double out = -1 + tim / 100 + intercept + slope * tim + rnormal()

mixed out c.tim || pid: tim , covariance(unstructured) stddeviations nolrtest nolog
estimates store Full
mixed out c.tim || pid: tim, stddeviations nolrtest nolog
lrtest Full

exit

*======================================================================================
Hi all,

I am confused about the random effects. Could someone explain the difference between the following two models?

(1) mixed out c.tim || pid: tim , covariance(unstructured)
(2) mixed out c.tim || pid, covariance(unstructured)

I do not know when I should specify only subject_id, or subject_id: time, or subject_id: group.
How do I decide which one is right, and how should I interpret the random effects in each of these three specifications?
Clyde Schechter

Join Date: Apr 2014

Posts: 30117
#18

03 Jan 2023, 20:44

The difference between (1) and (2) is that (2) includes only a random intercept for pid, so there is a single estimated coefficient for tim that applies to all the different pid's. In (1) your model assumes that each pid has its own distinct slope for the variable tim, in addition to its own intercept. Similarly, in a model with || subject_id: group, you are assuming that each subject_id has its own distinct slope for the variable group. So you have to think about what you believe is the most plausible relationship among these variables. Is it plausible that tim has the same slope (as a predictor of out) for all pid's (or close enough for practical purposes), or is it more reasonable to assume that the out:tim slope is different for different pid's? When you answer that question, you pick the corresponding model.
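It may help to see the two specifications written out as equations. This is only a sketch in notation I am introducing here (it does not appear in the posts above): i indexes pid, j indexes the observations within a pid, the betas are the fixed-effects coefficients, the u terms are the pid-level random effects, and epsilon is the residual.

```latex
% Model (2): random intercept only; one common slope for tim
\mathrm{out}_{ij} = \beta_0 + \beta_1\,\mathrm{tim}_{ij} + u_{0i} + \varepsilon_{ij},
\qquad u_{0i} \sim N(0, \sigma_{0}^{2}),
\quad \varepsilon_{ij} \sim N(0, \sigma_{\varepsilon}^{2})

% Model (1): random intercept and random slope for tim
\mathrm{out}_{ij} = \beta_0 + \beta_1\,\mathrm{tim}_{ij} + u_{0i} + u_{1i}\,\mathrm{tim}_{ij} + \varepsilon_{ij},
\qquad
\begin{pmatrix} u_{0i} \\ u_{1i} \end{pmatrix}
\sim N\!\left(
\begin{pmatrix} 0 \\ 0 \end{pmatrix},
\begin{pmatrix} \sigma_{0}^{2} & \sigma_{01} \\ \sigma_{01} & \sigma_{1}^{2} \end{pmatrix}
\right)
```

In model (1), each pid's slope is beta_1 + u_1i, so beta_1 is the average slope and sigma_1 describes how much individual slopes vary around it. With covariance(unstructured), the intercept-slope covariance sigma_01 is estimated; under the default independent structure it is constrained to zero, which is the distinction the quoted likelihood-ratio test examines. In model (2) there is only one random effect, so there is no covariance for covariance(unstructured) to free up.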