
  • Longitudinal data - generating variables dependent on observations within each subject

    Hi everyone,

    I have longitudinal data (see dataex below). I need to censor each id according to a few conditions.
    Condition 1: if within the same id, treatment = 2 occurs on the same date as treatment = 1, I need to use that treat_date as the censoring date.

    Condition 2: if, within the same id, the gap between consecutive treat_date values for the same treatment is >= x days, I need to censor at that date plus a specified add-on duration (say 10 days).

    How do I look within each id to determine if these conditions occur?


    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input int id byte treatment float treat_date
    11 1 20054
    11 1 20102
    11 1 20176
    11 1 20209
    11 1 20247
    11 1 20332
    11 2 20391
    11 2 20519
    11 2 20576
    11 3 20434
    11 3 20450
    11 5 20585
    11 5 20618
    11 5 20630
    11 5 20675
    11 5 20746
    12 1 19401
    12 1 19403
    12 1 19460
    12 1 19797
    12 2 19686
    12 2 19716
    12 3 19529
    12 3 19539
    12 3 19567
    12 3 19627
    12 3 19787
    12 4 19849
    12 4 19915
    12 4 19922
    12 4 19954
    12 4 19973
    12 4 20025
    12 4 20112
    12 4 20197
    12 4 20225
    12 4 20235
    12 4 20278
    12 4 20332
    12 4 20352
    12 4 20365
    12 4 20401
    12 4 20474
    12 4 20546
    12 4 20636
    12 4 20709
    12 4 20759
    12 4 20761
    12 5 19683
    12 5 19749
    12 5 19754
    13 1 20420
    13 1 20432
    13 1 20496
    13 1 20547
    13 1 20577
    13 1 20664
    13 1 20695
    13 1 20752
    14 2 17348
    14 2 17368
    14 2 17828
    14 2 17902
    14 2 17916
    14 2 18228
    14 2 18270
    14 2 18318
    14 2 18377
    14 2 18440
    14 2 18467
    14 2 18490
    14 2 18542
    14 3 17262
    14 3 17319
    14 3 17448
    14 3 17453
    14 3 17461
    14 3 17494
    14 3 17521
    14 3 17598
    14 3 17602
    14 3 17663
    14 3 17694
    14 3 17732
    14 3 17759
    14 3 17918
    14 3 18091
    14 3 18169
    14 4 18598
    14 4 18688
    14 4 18701
    14 4 18746
    14 4 18820
    14 4 18899
    14 4 18977
    14 4 19064
    14 4 19126
    14 5 17990
    14 5 18031
    15 1 18241
    end
    format %td treat_date

  • #2
    I'm not sure I can help you with all of them, but hopefully the code below can help you get started. Also, you might find the posts here, here, and here helpful.


    Code:
    bysort id (treatment treat_date): gen date_diff = treat_date - treat_date[_n-1]
    bysort id (treat_date): gen same_day = (treat_date == treat_date[_n-1])  // 1 if patient had 2 or more treatments on same day
    bysort id treatment (treat_date): gen d_diff2 = treat_date - treat_date[_n-1]
    gen too_long = 1 if d_diff2 >=10 & d_diff2 !=.  // 1 if delay between same treatment >= 10 days
    
    . list in 1/40, sepby(id treatment) noobs abbrev(12)
    
      +-------------------------------------------------------------------------+
      | id   treatment   treat_date   date_diff   same_day   d_diff2   too_long |
      |-------------------------------------------------------------------------|
      | 11           1    27nov2014           .          0         .          . |
      | 11           1    14jan2015          48          0        48          1 |
      | 11           1    29mar2015          74          0        74          1 |
      | 11           1    01may2015          33          0        33          1 |
      | 11           1    08jun2015          38          0        38          1 |
      | 11           1    01sep2015          85          0        85          1 |
      |-------------------------------------------------------------------------|
      | 11           2    30oct2015          59          0         .          . |
      | 11           2    06mar2016         128          0       128          1 |
      | 11           2    02may2016          57          0        57          1 |
      |-------------------------------------------------------------------------|
      | 11           3    12dec2015        -142          0         .          . |
      | 11           3    28dec2015          16          0        16          1 |
      |-------------------------------------------------------------------------|
      | 11           5    11may2016         135          0         .          . |
      | 11           5    13jun2016          33          0        33          1 |
      | 11           5    25jun2016          12          0        12          1 |
      | 11           5    09aug2016          45          0        45          1 |
      | 11           5    19oct2016          71          0        71          1 |
      |-------------------------------------------------------------------------|
      | 12           1    12feb2013           .          0         .          . |
      | 12           1    14feb2013           2          0         2          . |
      | 12           1    12apr2013          57          0        57          1 |
      | 12           1    15mar2014         337          0       337          1 |
      |-------------------------------------------------------------------------|
      | 12           2    24nov2013        -111          0         .          . |
      | 12           2    24dec2013          30          0        30          1 |
      |-------------------------------------------------------------------------|
      | 12           3    20jun2013        -187          0         .          . |
      | 12           3    30jun2013          10          0        10          1 |
      | 12           3    28jul2013          28          0        28          1 |
      | 12           3    26sep2013          60          0        60          1 |
      | 12           3    05mar2014         160          0       160          1 |
      |-------------------------------------------------------------------------|
      | 12           4    06may2014          62          0         .          . |
      | 12           4    11jul2014          66          0        66          1 |
      | 12           4    18jul2014           7          0         7          . |
      | 12           4    19aug2014          32          0        32          1 |
      | 12           4    07sep2014          19          0        19          1 |
      | 12           4    29oct2014          52          0        52          1 |
      | 12           4    24jan2015          87          0        87          1 |
      | 12           4    19apr2015          85          0        85          1 |
      | 12           4    17may2015          28          0        28          1 |
      | 12           4    27may2015          10          0        10          1 |
      | 12           4    09jul2015          43          0        43          1 |
      | 12           4    01sep2015          54          0        54          1 |
      | 12           4    21sep2015          20          0        20          1 |
      +-------------------------------------------------------------------------+
    
    * Note that in your sample data, no patient had 2 treatments on the same day
    . tabulate same_day
    
       same_day |      Freq.     Percent        Cum.
    ------------+-----------------------------------
              0 |        100      100.00      100.00
    ------------+-----------------------------------
          Total |        100      100.00
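    A side note on the too_long line above, as a minimal sketch on made-up values (the variable names naive, flag_1m and flag_01 are mine, not from the thread). Stata treats missing as larger than any number, so a bare d_diff2 >= 10 is also true when d_diff2 is missing. That is why the condition also excludes missing values. The sketch also shows a 0/1 version that leaves missing gaps as missing.

    ```stata
    clear
    input float d_diff2
    .
    2
    10
    48
    end

    * Bare comparison: missing counts as ">= 10", so the first row is flagged by mistake
    gen byte naive = (d_diff2 >= 10)

    * 1/missing flag, same logic as too_long above
    gen byte flag_1m = 1 if d_diff2 >= 10 & d_diff2 < .

    * 0/1 indicator: 0 for short gaps, 1 for long gaps, missing where the gap is missing
    gen byte flag_01 = (d_diff2 >= 10) if d_diff2 < .

    list, noobs
    ```

    Which of the two flag styles is more convenient probably depends on what comes next: the 1/missing form sorts the flagged rows first, while the 0/1 form can be tabulated or averaged directly.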
    A few quick notes:
    1) There are many instances where a higher treatment occurred before a lower treatment number (for example, patient #11 received treatment==3 142 days before his/her last treatment 2)--I don't know if you want to flag those.

    2) When you say you want to censor them, do you want to stop tracking them on that date, or do you just want to know the date on which the condition first occurred?

    3) There are no duplicate dates, so by definition there are none where treatment 1 & treatment 2 occurred on the same date.

    4) You'll note that in the code below I set same_day (like too_long above) equal to 1 if the condition is met and missing otherwise. This is nice for sorting because the 1s sort first (and the missing values after), but it's normally not good practice to leave the other observations missing.

    If there were duplicate dates (see note 3), you could do the following (I changed 2 of the dates so they were the same):
    Code:
    sort id treat_date treatment
    * treatment is part of the sort key so that, within a shared date, treatment 1 is always ordered before treatment 2
    bysort id (treat_date treatment): gen same_day = 1 if treat_date == treat_date[_n-1]  // this marks a 1 only for the 2nd duplicate
    duplicates tag id treat_date, gen(s_day2) // this marks 1 for both of the duplicates
    bysort id (treat_date treatment): gen to_censor = (same_day==1 & treatment==2 & treatment[_n-1]==1)
    
    . list in 1/40, sepby(id treatment) noobs abbrev(12)
    
      +-------------------------------------------------------------------------+
      | id   treatment   treat_date   date_diff   same_day   s_day2   to_censor |
      |-------------------------------------------------------------------------|
      | 11           1    27nov2014           .          .        0           0 |
      | 11           1    14jan2015          48          .        0           0 |
      | 11           1    29mar2015          74          .        0           0 |
      | 11           1    01may2015          33          .        0           0 |
      | 11           1    08jun2015          38          .        0           0 |
      | 11           1    30oct2015         144          .        1           0 |
      |-------------------------------------------------------------------------|
      | 11           2    30oct2015           0          1        1           1 |
      |-------------------------------------------------------------------------|
      | 11           3    12dec2015          43          .        0           0 |
      | 11           3    28dec2015          16          .        0           0 |
      |-------------------------------------------------------------------------|
      | 11           2    06mar2016          69          .        0           0 |
      | 11           2    02may2016          57          .        0           0 |
      |-------------------------------------------------------------------------|
      | 11           5    11may2016           9          .        0           0 |
      | 11           5    13jun2016          33          .        0           0 |
      | 11           5    25jun2016          12          .        0           0 |
      | 11           5    09aug2016          45          .        0           0 |
      | 11           5    19oct2016          71          .        0           0 |
      |-------------------------------------------------------------------------|
      | 12           1    12feb2013           .          .        0           0 |
      | 12           1    14feb2013           2          .        0           0 |
      |-------------------------------------------------------------------------|
      | 12           3    20jun2013         126          .        0           0 |
      | 12           3    30jun2013          10          .        0           0 |
      | 12           3    28jul2013          28          .        0           0 |
      | 12           3    26sep2013          60          .        0           0 |
      |-------------------------------------------------------------------------|
      | 12           5    21nov2013          56          .        0           0 |
      |-------------------------------------------------------------------------|
      | 12           1    24nov2013           0          .        1           0 |
      |-------------------------------------------------------------------------|
      | 12           2    24nov2013           3          1        1           1 |
      | 12           2    24dec2013          30          .        0           0 |
      |-------------------------------------------------------------------------|
      | 12           5    26jan2014          33          .        0           0 |
      | 12           5    31jan2014           5          .        0           0 |
      |-------------------------------------------------------------------------|
      | 12           3    05mar2014          33          .        0           0 |
      |-------------------------------------------------------------------------|
      | 12           1    15mar2014          10          .        0           0 |
      |-------------------------------------------------------------------------|
      | 12           4    06may2014          52          .        0           0 |
      | 12           4    11jul2014          66          .        0           0 |
      | 12           4    18jul2014           7          .        0           0 |
      | 12           4    19aug2014          32          .        0           0 |
      | 12           4    07sep2014          19          .        0           0 |
      | 12           4    29oct2014          52          .        0           0 |
      | 12           4    24jan2015          87          .        0           0 |
      | 12           4    19apr2015          85          .        0           0 |
      | 12           4    17may2015          28          .        0           0 |
      | 12           4    27may2015          10          .        0           0 |
      +-------------------------------------------------------------------------+
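
    To go from a flag to an actual censoring date for condition 1, here is a minimal sketch on made-up data. The names has1, has2 and censor1 are hypothetical, and I am assuming that the censoring date should be the earliest date on which treatments 1 and 2 coincide within an id. Because it groups by id and date, it does not depend on how rows are ordered within a date.

    ```stata
    clear
    input int id byte treatment float treat_date
    1 1 20000
    1 1 20005
    1 2 20005
    1 2 20100
    2 1 20000
    2 2 20004
    end
    format %td treat_date

    * Does this id have treatment 1 (resp. treatment 2) on this date?
    bysort id treat_date: egen byte has1 = max(treatment == 1)
    bysort id treat_date: egen byte has2 = max(treatment == 2)

    * Earliest date per id on which both occur; missing if that never happens
    bysort id: egen long censor1 = min(cond(has1 & has2, treat_date, .))
    format %td censor1

    list, sepby(id) noobs
    ```

    In this toy data, id 1 gets censor1 equal to the shared date and id 2, which never has both treatments on one date, gets a missing censor1.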

    • #3
      Thanks David,

      Originally posted by David Benson

      Code:
      bysort id (treatment treat_date): gen date_diff = treat_date - treat_date[_n-1]
      bysort id (treat_date): gen same_day = (treat_date == treat_date[_n-1]) // 1 if patient had 2 or more treatments on same day
      bysort id treatment (treat_date): gen d_diff2 = treat_date - treat_date[_n-1]
      gen too_long = 1 if d_diff2 >=10 & d_diff2 !=. // 1 if delay between same treatment >= 10 days

      I think what you helped with above will give me what I need (I did some other data manipulation and I think this will work).

      However, say I now want to assign to each id the date immediately before the first too_long==1 and drop all other observations for that id. That is, take the treat_date in line two of the dataex below (the line before the first too_long==1), generate a date variable equal to that date, and then keep only 1 line per id. How would I do that?

      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input int id byte treatment float(treat_date date_diff same_day d_diff2 too_long)
      1 1 17903   . 0   . .
      1 1 17942  39 0  39 .
      1 1 18011  69 0  69 1
      1 1 18050  39 0  39 .
      1 1 18091  41 0  41 .
      1 1 18169  78 0  78 1
      1 1 18185  16 0  16 .
      1 1 18198  13 0  13 .
      1 1 18220  22 0  22 .
      1 1 18286  66 0  66 1
      1 1 18312  26 0  26 .
      1 1 18372  60 0  60 1
      1 1 18428  56 0  56 .
      1 1 18508  80 0  80 1
      1 1 18528  20 0  20 .
      1 1 18534   6 0   6 .
      1 1 18546  12 0  12 .
      end
      format %td treat_date



      • #4
        See if this gets you what you want. I brought in the data from post #1.

        Code:
        sort id treatment treat_date
        bysort id ( treatment treat_date): gen n = _n
        bysort id (treatment treat_date): gen date_diff = treat_date - treat_date[_n-1]  // days since the previous row within id; note the first row of each treatment is differenced against the previous treatment's last date
        gen too_long = 1 if date_diff>=10 & date_diff<.
        bysort id ( too_long treatment treat_date): gen f_too_long = n[1] if too_long[1]==1  // finding first period where gap was too long
        egen c_too_long = total( too_long), by(id)  // c_too_long is "count of instances where gap in treatment >= 10 days"
        gen ever_too_long = ( c_too_long >=1)  // fills in for patient if they ever have a gap >= 10 days
        sort id n
        
        . list if n<=5, sepby( id) noobs abbrev(14)
        
          +--------------------------------------------------------------------------------------------------+
          | id   treatment   treat_date   n   date_diff   too_long   f_too_long   c_too_long   ever_too_long |
          |--------------------------------------------------------------------------------------------------|
          | 11           1    27nov2014   1           .          .            2           14               1 |
          | 11           1    14jan2015   2          48          1            2           14               1 |
          | 11           1    29mar2015   3          74          1            2           14               1 |
          | 11           1    01may2015   4          33          1            2           14               1 |
          | 11           1    08jun2015   5          38          1            2           14               1 |
          |--------------------------------------------------------------------------------------------------|
          | 12           1    12feb2013   1           .          .            3           27               1 |
          | 12           1    14feb2013   2           2          .            3           27               1 |
          | 12           1    12apr2013   3          57          1            3           27               1 |
          | 12           1    15mar2014   4         337          1            3           27               1 |
          | 12           2    24nov2013   5        -111          .            3           27               1 |
          |--------------------------------------------------------------------------------------------------|
          | 13           1    28nov2015   1           .          .            2            7               1 |
          | 13           1    10dec2015   2          12          1            2            7               1 |
          | 13           1    12feb2016   3          64          1            2            7               1 |
          | 13           1    03apr2016   4          51          1            2            7               1 |
          | 13           1    03may2016   5          30          1            2            7               1 |
          |--------------------------------------------------------------------------------------------------|
          | 14           2    01jul2007   1           .          .            2           34               1 |
          | 14           2    21jul2007   2          20          1            2           34               1 |
          | 14           2    23oct2008   3         460          1            2           34               1 |
          | 14           2    05jan2009   4          74          1            2           34               1 |
          | 14           2    19jan2009   5          14          1            2           34               1 |
          |--------------------------------------------------------------------------------------------------|
          | 15           1    10dec2009   1           .          .            .            0               0 |
          +--------------------------------------------------------------------------------------------------+
        
        
        * Dropping observations
        drop if ever_too_long==1 & n> f_too_long  // keeps all periods up to first gap >= 10 days.  Leaves those who never have gap untouched.
        
        . list if n<=5, sepby( id) noobs abbrev(14)
        
          +--------------------------------------------------------------------------------------------------+
          | id   treatment   treat_date   n   date_diff   too_long   f_too_long   c_too_long   ever_too_long |
          |--------------------------------------------------------------------------------------------------|
          | 11           1    27nov2014   1           .          .            2           14               1 |
          | 11           1    14jan2015   2          48          1            2           14               1 |
          |--------------------------------------------------------------------------------------------------|
          | 12           1    12feb2013   1           .          .            3           27               1 |
          | 12           1    14feb2013   2           2          .            3           27               1 |
          | 12           1    12apr2013   3          57          1            3           27               1 |
          |--------------------------------------------------------------------------------------------------|
          | 13           1    28nov2015   1           .          .            2            7               1 |
          | 13           1    10dec2015   2          12          1            2            7               1 |
          |--------------------------------------------------------------------------------------------------|
          | 14           2    01jul2007   1           .          .            2           34               1 |
          | 14           2    21jul2007   2          20          1            2           34               1 |
          |--------------------------------------------------------------------------------------------------|
          | 15           1    10dec2009   1           .          .            .            0               0 |
          +--------------------------------------------------------------------------------------------------+
        
        keep if n + 1 == f_too_long  // keeps only the period right before too_long==1  (note: it also deletes all patients who never have a 10 day gap.)
        // alternative that also keeps the patients who never have such a gap:
        // keep if ever_too_long==0 | (ever_too_long==1 & n + 1 == f_too_long)
        
        . list if n<=5, noobs abbrev(14)
        
          +--------------------------------------------------------------------------------------------------+
          | id   treatment   treat_date   n   date_diff   too_long   f_too_long   c_too_long   ever_too_long |
          |--------------------------------------------------------------------------------------------------|
          | 11           1    27nov2014   1           .          .            2           14               1 |
          | 12           1    14feb2013   2           2          .            3           27               1 |
          | 13           1    28nov2015   1           .          .            2            7               1 |
          | 14           2    01jul2007   1           .          .            2           34               1 |
          +--------------------------------------------------------------------------------------------------+
        A few notes:
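        1) The above counts days between consecutive rows sorted by treatment and then date, so the gaps are between treatments of the same type except on the first row of each new treatment, where the difference is taken against the previous treatment's last date (e.g. the -111 for id 12). Using bysort id treatment (treat_date): instead would restrict it strictly to the same treatment.
        2) I'm not sure how you want to handle patients who never have a >= 10 day gap in treatment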
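
        If the goal is a censoring date rather than a set of dropped rows, here is a minimal sketch of one reading of condition 2, on made-up data. The assumptions are mine: a 10-day threshold, a 10-day add-on, and "that date" taken to be the treatment date just before the earliest too-long gap. The names gap_days, prev_date, last_ok and censor2 are hypothetical. Patients who never have a too-long gap end up with a missing censor2, which is one way to handle note 2.

        ```stata
        clear
        input int id byte treatment float treat_date
        1 1 20000
        1 1 20005
        1 1 20030
        1 1 20035
        2 1 20000
        2 1 20004
        end
        format %td treat_date

        * Threshold and add-on duration in days (both assumed)
        local gap 10
        local addon 10

        * Gap to the previous date of the SAME treatment within id
        bysort id treatment (treat_date): gen gap_days = treat_date - treat_date[_n-1]

        * Date just before each too-long gap; missing on all other rows
        bysort id treatment (treat_date): gen long prev_date = treat_date[_n-1] if gap_days >= `gap' & gap_days < .

        * Earliest such date per id, plus the add-on; missing if no gap is ever too long
        bysort id: egen long last_ok = min(prev_date)
        gen long censor2 = last_ok + `addon'
        format %td prev_date last_ok censor2

        * One row per id
        bysort id: keep if _n == 1
        keep id censor2
        list, noobs
        ```

        Here id 1 has a 25-day gap after its second treatment, so censor2 is that second date plus 10 days, and id 2 keeps a missing censor2.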

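        Also, I was thinking that you might want the first_date, last_date (regardless of whether the patient is censored), and the total number of treatments a patient received.
        Code to do that is below. Run it before the steps I gave you, and note that it already creates n, so skip the gen n line in those steps.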
        Code:
        egen first_date = min( treat_date), by(id)
        egen last_date  = max( treat_date), by(id)
        format first_date last_date %td
        bysort id ( treat_date): gen count_treatments = _N
        bysort id ( treatment treat_date): gen n = _n
        
        . list id treatment treat_date first_date last_date count_treatments if n==1, noobs abbrev(18)
        
          +-------------------------------------------------------------------------+
          | id   treatment   treat_date   first_date   last_date   count_treatments |
          |-------------------------------------------------------------------------|
          | 11           1    27nov2014    27nov2014   19oct2016                 16 |
          | 12           1    12feb2013    12feb2013   03nov2016                 35 |
          | 13           1    28nov2015    28nov2015   25oct2016                  8 |
          | 14           2    01jul2007    06apr2007   13may2012                 40 |
          | 15           1    10dec2009    10dec2009   10dec2009                  1 |
          +-------------------------------------------------------------------------+
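
        If only the per-patient summary is needed, a possible alternative is collapse, sketched here on a few rows taken from post #1. Note that collapse replaces the data in memory with one row per id, so this is an assumption about the workflow: it fits a separate summary file, not the censoring steps above.

        ```stata
        clear
        input int id byte treatment float treat_date
        11 1 20054
        11 1 20102
        11 2 20391
        15 1 18241
        end

        * One row per id: earliest date, latest date, and number of non-missing treatment dates
        collapse (min) first_date=treat_date (max) last_date=treat_date (count) count_treatments=treat_date, by(id)
        format first_date last_date %td

        list, noobs
        ```

        Wrapping the collapse in preserve and restore, or saving the result to a new file, keeps the original long data available for the other steps.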
