  • Create variable with value from later event date

    Hi There - I have patient-level data on events of care for different conditions. The data are structured as below (example for a single patient):
    Patient Id   Cond Id   Event Date
    1            1         01/01/2010
    1            1         01/05/2013
    1            2         01/07/2014
    1            3         01/12/2016
    1            1         01/01/2017
    1            2         01/04/2017
    1            2         01/01/2018
    1            3         01/10/2018
    1            3         01/10/2019
    1            2         01/09/2020
    I would like to create a new variable holding, for each observation, the date of the next event for condition ID "2" that occurs after that observation's event date. The output I am looking to replicate is the fourth column of the table below.

    Patient Id   Cond Id   Event Date   Cond 2 Event Date
    1            1         01/01/2010   01/07/2014
    1            1         01/05/2013   01/07/2014
    1            2         01/07/2014   01/04/2017
    1            3         01/12/2016   01/04/2017
    1            1         01/01/2017   01/04/2017
    1            2         01/04/2017   01/01/2018
    1            2         01/01/2018   01/09/2020
    1            3         01/10/2018   01/09/2020
    1            3         01/10/2019   01/09/2020
    1            2         01/09/2020   .
    Any advice on how this variable may be created? The difficulty seems to be that the value to extract from the Event Date column sits at a different offset for each observation. I also need to create the variable separately for each patient in the data, so I suspect a "by Patient" prefix is needed?
    Many thanks in advance!

  • #2
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input byte(patientid condid) float eventdate
    1 1 18263
    1 1 19363
    1 2 19730
    1 3 20465
    1 1 20820
    1 2 20823
    1 2 21185
    1 3 21194
    1 3 21559
    1 2 21923
    end
    format %td eventdate
    
    
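    * Take the next event's date when that next event is for condition 2
    bys patientid (eventdate): gen wanted= eventdate[_n+1] if condid[_n+1]==2
    * Reverse the time order, then carry each condition 2 date back to earlier events
    gsort patientid -eventdate
    by patientid: replace wanted= wanted[_n-1] if missing(wanted) & !missing(wanted[_n-1])
    format wanted %td
    * Restore chronological order
    sort patientid eventdate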
    Result:

    Code:
    . l, sep(0)
    
         +-------------------------------------------+
         | patien~d   condid   eventdate      wanted |
         |-------------------------------------------|
      1. |        1        1   01jan2010   07jan2014 |
      2. |        1        1   05jan2013   07jan2014 |
      3. |        1        2   07jan2014   04jan2017 |
      4. |        1        3   12jan2016   04jan2017 |
      5. |        1        1   01jan2017   04jan2017 |
      6. |        1        2   04jan2017   01jan2018 |
      7. |        1        2   01jan2018   09jan2020 |
      8. |        1        3   10jan2018   09jan2020 |
      9. |        1        3   10jan2019   09jan2020 |
     10. |        1        2   09jan2020           . |
         +-------------------------------------------+
    Last edited by Andrew Musau; 31 May 2022, 10:58.



    • #3
      Code:
      * Example generated by -dataex-. For more info, type help dataex
      clear
      input byte(patientid condid) float eventdate
      1 1 18263
      1 1 19363
      1 2 19730
      1 3 20465
      1 1 20820
      1 2 20823
      1 2 21185
      1 3 21194
      1 3 21559
      1 2 21923
      end
      format %td eventdate
      
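      * Build a lookup file holding only the condition 2 event dates
      preserve
      keep if condid == 2
      drop condid
      rename eventdate cond_2_eventdate
      tempfile event2date
      save `event2date'
      
      * Pair each observation with every condition 2 date strictly after its own date
      restore
      gen long obs_no = _n
      gen eventdate1 = eventdate + 1
      rangejoin cond_2_eventdate eventdate1 . using `event2date', by(patientid)
      * Keep only the earliest such date for each original observation
      by obs_no (cond_2_eventdate), sort: keep if _n == 1
      drop eventdate1 obs_no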
      This code requires the -rangejoin- command, by Robert Picard, available from SSC. To use -rangejoin- you must also install -rangestat-, by Robert Picard, Nick Cox, and Roberto Ferrer, also available from SSC.
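
      For reference, both packages can normally be installed as sketched below. This assumes the machine has internet access and permission to install user-written commands, which may not hold on a restricted network.

      ```stata
      * Install the two user-written packages from SSC (requires internet access)
      ssc install rangestat
      ssc install rangejoin
      ```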

      The code assumes that in your real data set, eventdate is a true Stata internal format date variable. If it is not, you must convert it accordingly before running this code.
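
      A minimal sketch of that conversion follows. It assumes the dates are held as strings in a variable named eventdate_str (a hypothetical name) and that they are in month/day/year order. The order is not clear from the original table, so if your dates are day/month/year, use "DMY" as the mask instead.

      ```stata
      * Toy data: dates stored as strings (eventdate_str is a hypothetical name)
      clear
      input str10 eventdate_str
      "01/01/2010"
      "01/07/2014"
      end

      * Convert to a Stata internal daily date; "MDY" is an assumption about the order
      gen long eventdate = date(eventdate_str, "MDY")
      format eventdate %td
      list
      ```

      After conversion, check a few observations by eye; date() returns missing for any string it cannot parse with the given mask.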

      In the future, when showing data examples, please use the -dataex- command to do so, as I have here. If you are running version 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

      Added: Crossed with #2 which offers a different solution.
      Last edited by Clyde Schechter; 31 May 2022, 11:03.



      • #4
        Thank you both for the suggested solutions - and the tip to use dataex in the future.

        Solution #2 from Andrew does not meet my need, as the code fails to link the right eventdate values unless the _n+1 observation has condid == 2. Please note that the data example provided only illustrates the structure of the actual data; the sequence of actual values may be different for other patients.

        I will test Clyde's solution next. Sorry for the delay: I cannot install new commands directly, as I am working on a secure network. I will report back as soon as the new commands are made available.
