  • Survival analysis and problems with multiple rows for each patient

    Hello everyone,

    I'm running a survival analysis with a Cox model on a dataset collected from an RCT, using Stata 16.1.
    In the two dataex examples below, the variable V2 is the patient ID. Each patient has exactly one row with data on status and mnd (months, i.e. time to status), and this is the last row for each patient. These two variables are missing in the patient's other rows, but those rows contain other valuable information, e.g. diagnosis variables and drug variables (potential risk factors). The number of other rows varies between patients.
    My question is: how can I move the information about diagnoses and drugs from a patient's other rows onto the one row that contains values for mnd and status? E.g. something like this:

    V2   mnd                status  hjsvikt  diabetes  M01A_  C09_
    551  .75564681724846    1       0        0         1      0
    (552 - 559: examples not included)
    560  .3613963039014374  1       1        0         1      1

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input double(V2 mnd status) long(hjsvikt diabetes) float(M01A_ C09_)
    551               . . 0 0 0 0
    551               . . 0 0 0 0
    551               . . 0 0 0 0
    551               . . 0 0 0 0
    551               . . 0 0 0 0
    551               . . 0 0 0 0
    551               . . 0 0 0 0
    551               . . 0 0 0 0
    551               . . 0 0 0 0
    551               . . 0 0 0 0
    551 .75564681724846 1 0 0 0 0
    551               . . 0 0 0 0
    551               . . 0 0 1 0
    551               . . 0 0 0 0
    551               . . 0 0 0 0
    551               . . 0 0 0 0
    551               . . 0 0 0 0
    551               . . 0 0 0 0
    551               . . 0 0 0 0
    551               . . 0 0 0 0
    551               . . 0 0 0 0
    551               . . 0 0 0 0
    551               . . 0 0 0 0
    551               . . 0 0 0 0
    end
    label values status Status_ut_2017
    label def Status_ut_2017 1 "Reinnlagt", modify

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input double(V2 mnd status) long(hjsvikt diabetes) float(M01A_ C09_)
    560                 . . 1 0 0 0
    560                 . . 0 0 0 1
    560                 . . 0 0 0 0
    560                 . . 0 0 0 0
    560                 . . 0 0 0 0
    560                 . . 0 0 0 0
    560                 . . 0 0 0 0
    560                 . . 0 0 0 0
    560                 . . 0 0 0 0
    560                 . . 0 0 0 0
    560                 . . 0 0 0 0
    560                 . . 0 0 0 0
    560                 . . 0 0 0 0
    560                 . . 0 0 1 0
    560                 . . 0 0 1 0
    560 .3613963039014374 1 0 0 0 0
    end
    label values status Status_ut_2017
    label def Status_ut_2017 1 "Reinnlagt", modify

    Other information: I think I need to do this because otherwise Stata omits the information in the rows that have no survival data (status, time to status). I have tried to fill in these values in all rows with the xfill command, e.g. xfill months, i(V2) and xfill status, i(V2); however, Stata then does not recognize that these rows belong to the same patient and treats every row as a separate case. The resulting Kaplan-Meier plots show 8000 cases, not the 400 there should be.
    Once all the information I need for further analysis is collected in the one row of interest for each patient, I hope I can use drop if status == . and that the dataset is then ready for survival analysis and estimation of the Cox model.
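
    One possible way to get the layout described above is sketched here. It assumes that the diagnosis and drug variables are 0/1 indicators and that "ever recorded on any row for the patient" is the intended summary. Both are assumptions, not something confirmed in the post. The input is a shortened subset of the two dataex examples, and the temporary variable name ever is made up for illustration.

```stata
clear
input double(V2 mnd status) long(hjsvikt diabetes) float(M01A_ C09_)
551               . . 0 0 0 0
551 .75564681724846 1 0 0 0 0
551               . . 0 0 1 0
560                 . . 1 0 0 0
560                 . . 0 0 0 1
560                 . . 0 0 1 0
560 .3613963039014374 1 0 0 0 0
end

* For each risk factor, copy the patient-level maximum
* (1 if the indicator is 1 on any row) onto every row of that patient.
* "ever" is a hypothetical temporary variable name.
foreach v of varlist hjsvikt diabetes M01A_ C09_ {
    bysort V2: egen byte ever = max(`v')
    replace `v' = ever
    drop ever
}

* Keep only the row that carries the survival data.
keep if !missing(status)

* Stop with an error if any patient still has more than one row.
isid V2

* Declare the survival data: mnd is time, status == 1 is the event.
stset mnd, failure(status == 1)

list V2 mnd status hjsvikt diabetes M01A_ C09_, noobs
```

    Note that this counts an indicator even when it appears on a row listed after the row with the survival data. If only exposures recorded before the event should count, the rows would first need a date or sequence variable to restrict on, and none appears in the examples.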