  • Timevar for survival analysis

    Dear All,

    This might be a silly question, but it is driving me crazy.

    I am working with data that were not recorded for survival analysis, and I am trying to put them into a suitable format.

    For the purpose of my question, here are my data (I have more variables, but they behave like Var1 and Var2, that is, they vary over time):
    ID  Visit  Date       DOsp1      DOsp2      Sex  Var1  Var2
    1   0      1mar2002                         M    0     .
    1   1      3jun2005                         M    .     .
    1   2      4feb2007                         M    .     .
    2   0      9feb2002   21dec2000  22jun2001  F    1     18.9
    2   1      7sep2002                         F    2     9999
    3   0      25mar2003                        M    0     20
    3   1      13oct2004                        M    2     9999
    4   0      4oct2002                         F    1     23.5
    4   1      03may2004  4jan2003   24jun2003  F    .     .
    4   2      13jan2006                        F    .     .
    4   3      25aug2007                        F    2     9999

    ID is my person identifier. Each person can be visited several times (Visit; 0 is the baseline) on different dates (Date is when the visit took place). At each visit, a person could report up to 9 dates (I have DOsp1-DOsp9, but for the sake of this question I show only the first two) indicating whether and when they were hospitalized between visits.

    I will use snapspan to convert my data to time-span data, but first I think I need to adjust my time variable (and the dataset overall).

    I want a time variable like Time (see the table below) so that I can run snapspan ID Time.

    ID  Visit  Date       DOsp1      DOsp2      Sex  Var1  Var2  Time
    1   0      1mar2002                         M    0     .     1mar2002
    1   1      3jun2005                         M    .     .     3jun2005
    1   2      4feb2007                         M    .     .     4feb2007
    2   .      .          .          .          .    .     .     21dec2000
    2   .      .          .          .          .    .     .     22jun2001
    2   0      9feb2002   21dec2000  22jun2001  F    1     18.9  9feb2002
    2   1      7sep2002                         F    2     9999  7sep2002
    3   0      25mar2003                        M    0     20    25mar2003
    3   1      13oct2004                        M    2     9999  13oct2004
    4   0      4oct2002                         F    1     23.5  4oct2002
    4   .      .          .          .          .    .     .     4jan2003
    4   .      .          .          .          .    .     .     24jun2003
    4   1      03may2004  4jan2003   24jun2003  F    .     .     03may2004
    4   2      13jan2006                        F    .     .     13jan2006
    4   3      25aug2007                        F    2     9999  25aug2007
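
    A rough sketch of how Time might be built, in case it helps frame the question. It assumes that Date and DOsp1-DOsp9 are numeric Stata daily dates (string dates would first need date(var, "DMY")) and that ID and Visit jointly identify each row. The names osp_n and hosp are made up for illustration, and the block should be run in one go from a do-file because it uses a tempfile.

    ```stata
    * Put the hospitalization dates in long form: one row per reported date
    preserve
    keep ID Visit DOsp*
    reshape long DOsp, i(ID Visit) j(osp_n)   // osp_n = 1..9 (hypothetical name)
    drop if missing(DOsp)
    rename DOsp Time
    keep ID Time osp_n
    tempfile hosp
    save `hosp'
    restore

    * On visit rows, Time is simply the visit date
    generate Time = Date
    format Time %td

    * Stack the hospitalization rows under the visit rows
    append using `hosp'
    sort ID Time
    ```

    With this sketch, hospitalization rows have Visit missing, as in the table above, and Sex, Var1 and Var2 are missing on those rows too. Dates before Visit 0 are still present at this stage.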

    This is the final dataset I want to obtain:
    ID  Datestarts  Dateends   Sex  Var1  Var2  Event    Event_recode
    1   .           1mar2002   M    0     .     Visit 0  0
    1   1mar2002    3jun2005   M    .     .     Visit 1  0
    1   3jun2005    4feb2007   M    .     .     Visit 2  0
    2   .           9feb2002   F    1     18.9  Visit 0  0
    2   9feb2002    7sep2002   F    2     9999  Visit 1  2
    3   .           25mar2003  M    0     20    Visit 0  0
    3   25mar2003   13oct2004  M    2     9999  Visit 1  2
    4   .           4oct2002   F    1     23.5  Visit 0  0
    4   4oct2002    4jan2003   F    .     .     Osp 1    1
    4   4jan2003    24jun2003  F    .     .     Osp 2    1
    4   24jun2003   03may2004  F    .     .     Visit 1  0
    4   03may2004   13jan2006  F    .     .     Visit 2  0
    4   13jan2006   25aug2007  F    2     9999  Visit 3  2

    As you might notice, any date recorded in DOsp1-DOsp9 that falls before Visit 0 is not taken into account. Event_recode will then be built to serve as the failure variable for my stset: 0 if the row refers to a visit, 1 if it refers to a hospitalization, 2 if the person dies (that is, Var1==2), and 3 if the person is censored.
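
    To make these rules concrete, here is a minimal sketch of how I picture them in code. It assumes the stacked layout of the Time table above (Visit is missing on hospitalization rows) and that Time is a numeric daily date; baseline is a helper name I made up. The censoring code 3 is left out because I have not yet pinned down how censored rows are identified.

    ```stata
    * Baseline date per person = Time on the Visit 0 row (baseline is a hypothetical name)
    bysort ID: egen baseline = min(cond(Visit == 0, Time, .))

    * Ignore hospitalizations reported before the baseline visit
    drop if missing(Visit) & Time < baseline

    * 0 = visit, 1 = hospitalization, 2 = death (Var1 == 2)
    generate byte Event_recode = 0
    replace Event_recode = 1 if missing(Visit)
    replace Event_recode = 2 if Var1 == 2

    sort ID Time
    ```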

    All of that, in order to run the following code:

    stset Dateends, id(ID) time0(Datestarts) origin(time Datestarts) failure(Event_recode==1 2)

    Thank you to anyone who can help me; feel free to ask for clarification.
    Best