Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • create a new variable or new dataset?

    Find sample data below.

    An individual can have a kidney stone......

    first treatment is is to monitor (1), if this fails - the patient can have a stent(2) and if this fails, procedure 3 (nephrostomy)

    I would like to see the

    individual (or rather individuals) who had procedure 1 and see if this has failed to become procedure 2 and subsequently if 2 has failed to become procedure 3.

    The aim is to compare it to those individuals who have procedure (2) first and calculate the risk of failure to become procedure 3.
    I can do the failure statistics, or rather survival statistics.

    But I can not think of a way to create the data.

    1. Should I create a new dataset?
    2. Should I create a new variable; using - bys + date + laterality +mrn

    It would be easier to have one row for each patient, generate a variable that indicates if the procedure (1) has failed to become (2) to become (3) .
    However not sure if this is possible, perhaps an expert in data cleaning can help.

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float(procedure mrn gender age barcodeno fail laterality date2)
    1  12 1  22 12 1 1 14641
    2  12 1  25 17 1 1 15342
    3  12 1  87 20 0 1 16805
    2 103 2  90 76 1 2 16802
    3 103 2 100 98 0 2 18295
    1  65 2  87 90 0 2 17563
    end
    format %td date2
    label values gender f
    label def f 1 "male", modify
    label def f 2 "female", modify
    label values laterality right
    label def right 1 "Right", modify
    label def right 2 "left", modify
    /mrn is the Scottish way for specifying patient identification number
    Of course this data is falsified.

  • #2
    I think I would do this as follows:
    Code:
    isid mrn date2, sort
    assert !missing(procedure)
    
    collapse (first) first_procedure = procedure first_date = date2 start_age = age ///
        (last) last_procedure = procedure last_date = date2 gender, by(mrn laterality)
    
    stset last_date, failure(last_procedure == 3) origin(first_date)

    Comment


    • #3
      Clyde Schechter - thank you for this.
      I didn't know the -collapse- had first and last options.


      I guess, trying to take it a step further

      If I wanted to to see the

      individual (or rather individuals) who had procedure 1 and see if this has failed to become procedure 2 (Let's call this FIRST FAILURE) - risk failure x
      (eg pt mrn 12 who had procedure 1 ( year: 2000) then subsequently had procedure 2 (year 2002)


      AND

      compare it to the individual who started with procedure 2 which then failed to become procedure 3 and subsequently if 2 has failed to become procedure 3 (LAST FAILURE) - risk failure y
      (eg pt mrn 103- had procedure 2 then this failed to become procedure 3 - 2 years later)


      I have achieved the aim to compare those individuals who have LAST FAILURE (procedure 3) - with the function - collapse(last) -thanks clyde (see post 2)

      However, are you able to suggest a modification to review what happens -inbetween first and last, ie comparing, risk failure x to risk of failure y.-. As there is no -inbetween- option with collapse
      I tried the below, which of course only works with the last date.


      Code:
      
      collapse (first) first_procedure = procedure first_date= date2 start_age = age ///
      (last) first_failure= procedure first_failuredate=date2 gender, by(mrn laterality)

      Comment


      • #4
        Perhaps I wasn't that articulate, this is what I Would like to see (FIRST FAILURE - 2ND FAILURE - 3RD FAILURE) and the dates for 1st, failure, 2ndfailure and 3rd failure.
        Clye has shown me how to do first and last, but what happens for the 'inbetween i.e 2nd failure')
        Then I will just keep one line for each mrn

        Code:
        * Example generated by -dataex-. For more info, type help dataex
        clear
        input float(procedure mrn gender age barcodeno failure laterality opdate firstfail failure_2nd failure_3rd)
        1  12 1  22 12 1 1 14641 1 1 0
        2  12 1  25 17 1 1 15342 1 1 0
        3  12 1  87 20 0 1 16805 1 1 0
        1  65 2  87 90 0 2 17563 0 0 0
        2 103 2  90 76 1 2 16802 1 0 0
        3 103 2 100 98 0 2 18295 1 0 0
        end
        format %td opdate
        label values procedure proc
        label def proc 1 "monitor", modify
        label def proc 2 "stent", modify
        label def proc 3 "nephrostomy", modify
        label values gender f
        label def f 1 "male", modify
        label def f 2 "female", modify
        label values laterality right
        label def right 1 "Right", modify
        label def right 2 "left", modify
        Last edited by Denise Vella; 10 Nov 2023, 05:38.

        Comment


        • #5
          Hello, managed to solve it


          Code:
          /Create order
          bys mrn gender laterality: gen total_noofentries = _N
          bys mrn gender laterality: gen orderof_entries= _n
          
          //Create first - 2nd - 3rd failures
          bys mrn gender laterality: gen test_FIRST_failure = 1 if fail ==1 & orderof_entries == 1
          bys mrn gender laterality: gen test_SECON_failure = 1 if fail ==1 & orderof_entries == 2
          bys mrn gender laterality: gen test_THIRD_failure = 1 if fail ==1 & orderof_entries == 3
          
          //Create dates for first - 2nd - 3rd failures
          gen first_datefailure = date2 if test_FIRST_failure == 1
          format first_datefailure %td
          
          //create dates for Second failure
          gen second_datefailure = date2 if test_SECON_failure == 1
          format second_datefailure %td
          
          //trying to get everything in one row
          
          //First failure 
          replace test_FIRST_failure = 0 if test_FIRST_failure !=1 
          rangestat (max) Final_1st_revision = test_FIRST_failure, interval(test_FIRST_failure 0 1) by(mrn gender laterality)
          
          //First date faiure
          replace first_datefailure = 0 if first_datefailure ==. 
          rangestat (max) Final_1_date_revision = first_datefailure, interval(first_datefailure 0 .) by(mrn gender laterality)
          format Final_1_date_revision %td
          
          
          
          //Second failure 
          replace test_SECON_failure = 0 if test_SECON_failure !=1 
          rangestat (max) Final_2nd_rev = test_SECON_failure, interval(test_SECON_failure 0 1) by(mrn gender laterality)
          
          
          replace second_datefailure = 0 if second_datefailure ==. 
          rangestat (max) Second_2_date_revision = second_datefailure, interval(second_datefailure 0 .) by(mrn gender laterality)
          format Second_2_date_revision %td
          
          
          keep if orderof_entries == 1

          Comment

          Working...
          X