Dear all,
I'm new to this forum and have been trying to piece together an approach to a panel data regression I am working on, using Stata 14 and a mixed model.
For background, I have a small unbalanced panel dataset with longitudinal data on 202 people, where the dependent variable is a calculated disease severity score (MScore, range 0-100). Scores are taken at different time points (days): day 0 is defined as the first score for that patient, and subsequent days are counted from there per patient, e.g. one person may have assessments at day 0/13/46, and another at day 0/22/34/59, etc.
As the first score for each patient (day 0) could be taken at a different disease stage, I expect the baseline MScore to vary quite a bit between patients.
I am ultimately interested in a simple linear regression of MScore on days, and in using the predicted values to derive a slope of decline in MScore per day for each patient.
I would then like to compare the mean slopes of time-invariant groups (e.g. gender).
The data I'm interested in look like this:
Code:
xtset ID days
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input int ID double rank float MScore double baseline int days
8 1 33.284023 33.28402328491211 0
8 2 0 33.28402328491211 113
24 1 48.37278 48.372779846191406 0
24 2 48.37278 48.372779846191406 28
24 3 48.37278 48.372779846191406 55
24 4 38.7574 48.372779846191406 141
24 5 26.035503 48.372779846191406 218
38 1 38.7574 38.75739669799805 0
49 1 59.1716 59.17159652709961 0
49 2 59.1716 59.17159652709961 42
49 3 59.1716 59.17159652709961 69
49 4 59.1716 59.17159652709961 118
49 5 53.40237 59.17159652709961 182
49 6 66.27219 59.17159652709961 252
49 7 53.40237 59.17159652709961 357
49 8 15.384615 59.17159652709961 1078
54 1 33.284023 33.28402328491211 0
54 2 0 33.28402328491211 47
62 1 33.284023 33.28402328491211 0
62 2 15.384615 33.28402328491211 23
64 1 38.7574 38.75739669799805 0
66 1 48.37278 48.372779846191406 0
66 2 38.7574 48.372779846191406 22
67 1 48.37278 48.372779846191406 0
67 2 43.63905 48.372779846191406 16
67 3 48.37278 48.372779846191406 59
67 4 48.37278 48.372779846191406 88
67 5 0 48.372779846191406 152
73 1 33.284023 33.28402328491211 0
74 1 38.7574 38.75739669799805 0
74 2 26.035503 38.75739669799805 27
76 1 74.85207 74.8520736694336 0
76 2 48.37278 74.8520736694336 20
76 3 48.37278 74.8520736694336 48
end
I have a few questions around this:
1. As I would like individual slopes (not mean group slope), I have used a mixed model with RE around the individual patient (to account for different slope/intercept per patient) - is this reasonable?
Code:
mixed OfficialMotor100 days if dcode==1 & baseline>29 & preterm_Ex!=1 || ID:days
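Written out, and assuming the command above is meant as a random-intercept, random-slope model, the specification for patient i at assessment j would be roughly the following. The symbols (y, u, Sigma) are notation introduced here for illustration and do not come from the Stata output:

```latex
\begin{aligned}
y_{ij} &= (\beta_0 + u_{0i}) + (\beta_1 + u_{1i})\,\mathrm{days}_{ij} + \varepsilon_{ij}, \\
(u_{0i},\, u_{1i})^{\top} &\sim N(\mathbf{0},\, \Sigma), \qquad
\varepsilon_{ij} \sim N(0,\, \sigma^2_{\varepsilon})
\end{aligned}
```

Here beta_1 is the mean slope and u_1i is patient i's deviation from it. As far as I understand, -mixed- defaults to covariance(independent), so Sigma is diagonal unless covariance(unstructured) is added to let intercepts and slopes correlate.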
2. Is it correct to derive the individual slopes (including random effects) using:
Code:
predict r1 r0, reffects
gen b0 = _b[_cons] + r0
gen b1 = _b[days] + r1
where b1 is the individual slope including the random effect, i.e. the change in MScore per day?
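As a rough sketch of how the resulting per-patient values could be inspected, assuming b0 and b1 have been generated as above: because gen repeats each patient's value on every row, a flag for one row per patient is needed. The variable name onepp is hypothetical, chosen here for illustration.

```stata
* Assumes b0 and b1 already exist (fixed effect + predicted random effect)
* onepp is a hypothetical flag name: 1 on exactly one row per patient
egen byte onepp = tag(ID)

* Distribution of the patient-level slopes, one value per patient
summarize b1 if onepp

* One row per patient with predicted intercept and slope
list ID b0 b1 if onepp, noobs
```

These values are built from predicted random effects, so they are shrunk toward the mean slope, more strongly for patients with only a few assessments.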
3. I would like to look at the effect of baseline, i.e. whether the mean slope varies with baseline. I'm not sure how to go about this; would it be best to regress the predictions from 2. against baseline? I understand I could use baseline as a covariate in the original mixed model, but I don't want to lose this initial datapoint when estimating slopes, as some patients only have 2 measures. Could anyone suggest an approach?
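One possible setup, offered only as a sketch and not as a recommended analysis, is a days-by-baseline interaction inside the mixed model itself. Since baseline is constant within patient, every row (including day 0) stays in the estimation sample. The at() values 30, 50 and 70 are purely illustrative, and the if conditions from the model in question 1 are left out for brevity.

```stata
* Sketch only: the mean slope in days is allowed to vary with baseline
* (baseline is time-invariant, so the day-0 rows remain in the sample)
mixed OfficialMotor100 c.days##c.baseline || ID: days, covariance(unstructured)

* Estimated slope per day at a few illustrative baseline values
margins, dydx(days) at(baseline=(30 50 70))
```

The coefficient on c.days#c.baseline would then describe how the slope changes per unit of baseline. One caveat under this setup: baseline is itself the day-0 score, so its association with the intercept is partly mechanical.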
Many thanks for your help