  • Mixed linear regression using unbalanced panel data - is this approach reasonable?

    Dear all,

    I'm new to this forum and have been trying to piece together an approach to a panel data regression I am working on, using Stata 14 and a mixed model.

    For background, I have a small unbalanced panel dataset with longitudinal data on 202 people, where the dependent variable is a calculated disease severity score (MScore, range 0-100). Scores are sampled at different time points (days): day 0 is defined as the date of each patient's first score, and subsequent days are counted from there for that patient. For example, one person may have assessments at days 0/13/46 and another at days 0/22/34/59.

    As the first score for each patient (day 0) could be taken at a different disease stage, I expect the baseline MScore to vary quite a bit between patients.

    Ultimately, I would like to fit a linear model of MScore against days and use the predicted values to derive a slope of decline in MScore per day for each patient.
    I would then like to compare the mean slopes between time-invariant groups (e.g. gender).
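
    A minimal sketch of how that group comparison might be set up inside the mixed model itself, rather than on predicted slopes. The indicator variable female is hypothetical (it is not in the example data), and covariance(unstructured) is an assumption that lets the random intercept and random slope be correlated:

    Code:
    * female is a hypothetical 0/1 time-invariant group indicator
    mixed MScore c.days##i.female || ID: days, covariance(unstructured)
    * the 1.female#c.days coefficient is the between-group difference in mean slope
    margins female, dydx(days)

    Here margins reports the estimated mean slope of MScore on days for each group.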

    The data I'm interested in look like this:

    Code:
    xtset ID days
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input int ID double rank float MScore double baseline int days
     8 1 33.284023  33.28402328491211    0
     8 2         0  33.28402328491211  113
    24 1  48.37278 48.372779846191406    0
    24 2  48.37278 48.372779846191406   28
    24 3  48.37278 48.372779846191406   55
    24 4   38.7574 48.372779846191406  141
    24 5 26.035503 48.372779846191406  218
    38 1   38.7574  38.75739669799805    0
    49 1   59.1716  59.17159652709961    0
    49 2   59.1716  59.17159652709961   42
    49 3   59.1716  59.17159652709961   69
    49 4   59.1716  59.17159652709961  118
    49 5  53.40237  59.17159652709961  182
    49 6  66.27219  59.17159652709961  252
    49 7  53.40237  59.17159652709961  357
    49 8 15.384615  59.17159652709961 1078
    54 1 33.284023  33.28402328491211    0
    54 2         0  33.28402328491211   47
    62 1 33.284023  33.28402328491211    0
    62 2 15.384615  33.28402328491211   23
    64 1   38.7574  38.75739669799805    0
    66 1  48.37278 48.372779846191406    0
    66 2   38.7574 48.372779846191406   22
    67 1  48.37278 48.372779846191406    0
    67 2  43.63905 48.372779846191406   16
    67 3  48.37278 48.372779846191406   59
    67 4  48.37278 48.372779846191406   88
    67 5         0 48.372779846191406  152
    73 1 33.284023  33.28402328491211    0
    74 1   38.7574  38.75739669799805    0
    74 2 26.035503  38.75739669799805   27
    76 1  74.85207   74.8520736694336    0
    76 2  48.37278   74.8520736694336   20
    76 3  48.37278   74.8520736694336   48
    end

    I have a few questions around this:

    1. As I would like individual slopes (not just a mean group slope), I have used a mixed model with a random intercept and a random slope on days for each patient, to allow a different intercept and slope per patient. Is this reasonable?

    Code:
    mixed OfficialMotor100 days if dcode==1 & baseline>29 & preterm_Ex!=1 || ID:days

    2. Is it correct to derive the individual slopes (including random effects) using:

    Code:
    predict r1 r0, reffects
    gen b0 = _b[_cons] +r0
    gen b1 = _b[days] +r1
    where b1 is the individual slope including the random effect, i.e. the change in MScore per day?
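
    A sketch of the same steps with the predictions named after the terms they belong to, so that the order can be checked. It uses MScore as the outcome and assumes an unstructured random-effects covariance; re_days, re_cons and tag are names introduced here for illustration:

    Code:
    mixed MScore days || ID: days, covariance(unstructured)
    * BLUPs are returned in the order of the random-effects equation: days, then _cons
    predict double re_days re_cons, reffects
    * the variable labels state which term each prediction refers to
    describe re_days re_cons
    generate double b0 = _b[_cons] + re_cons
    generate double b1 = _b[days] + re_days
    * flag one row per patient so that each slope is counted once
    egen byte tag = tag(ID)
    summarize b1 if tag

    Note that patients with very few visits still receive a predicted slope, but it is shrunk towards the overall mean slope.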


    3. I would also like to look at the effect of baseline, i.e. whether the mean slope varies with the baseline score. I'm not sure how to go about this: would it be best to regress the predicted slopes from 2. on baseline? I understand I could include baseline as a covariate in the original mixed model, but I don't want to lose the initial data point when estimating the slopes, as some patients only have 2 measures. Could anyone suggest an approach?
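
    One possible sketch of the covariate route, under the assumption that baseline is constant within patient (as in the example data above). Adding it to the model drops no rows unless baseline is missing, so the day 0 observations stay in the estimation sample. Whether the day 0 score should appear both as an outcome and as a covariate is a separate modelling question that this sketch does not settle:

    Code:
    mixed MScore c.days##c.baseline || ID: days, covariance(unstructured)
    * c.days#c.baseline: change in the mean slope per one-point higher baseline
    * check that the estimation sample still includes the day 0 rows
    count if e(sample) & days == 0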

    Many thanks for your help


    Last edited by Alan Niemann; 18 Jun 2018, 06:36.