  • Problem writing a loop with multiple layers


    I can’t figure out how to write a loop for this, although I think a loop is the only way to solve it. I do not think -rangestat- will work.

    I would like to:
    1. For each id/observation, determine the time from Start_Date to the date by which 50% of the persons (ids) in the same group who have an overlapping interval have been transplanted.
    Transplanted = 1 means the person was transplanted.
    id = person
    end_date is the last date of follow-up; the person may or may not have been transplanted.

    clear
    input float id str8 group float Start_Date long end_date float Transplanted
    1 "09054" 16807 16839 1
    2 "09054" 16812 16841 1
    3 "09054" 16831 16845 1
    4 "09054" 16838 16848 0
    5 "09054" 16852 16878 1
    6 "09054" 16891 16897 1
    7 "09054" 16898 16900 0
    8 "09054" 16835 16909 1
    9 "09054" 16877 16912 1
    10 "09054" 16908 16916 1
    11 "09952" 16877 16918 0
    12 "09952" 16926 16932 1
    13 "09952" 16940 16946 1
    14 "09952" 16840 16954 1
    15 "09952" 16926 16965 1
    16 "00952" 16908 16966 1
    17 "00952" 16960 16967 1
    18 "00952" 16961 16969 1
    19 "00952" 16968 16979 0
    20 "00952" 16944 16982 1
    21 "29002" 16988 16995 0
    22 "29002" 16975 16999 1
    23 "23002" 16971 17008 1
    24 "23002" 16937 17014 0
    25 "23002" 17017 17022 1
    26 "23002" 17015 17024 1
    27 "23002" 16926 17026 0
    28 "23002" 16924 17032 1
    29 "23002" 16982 17034 1
    30 "23002" 16996 17035 1
    end
    format %d Start_Date
    format %d end_date


    I calculated the number of people who had an overlapping interval in the same group using:
    * -by- requires the data to be sorted by group already
    by group: generate Total_Group = _n
    **
    generate Number_Removed = .
    local N = _N
    quietly forval i = 1/`N' {
        * count same-group members whose follow-up ended before observation i started
        count if group == group[`i'] & end_date < Start_Date[`i']
        replace Number_Removed = r(N) in `i'
    }
    **
    generate Total_in_Group_At_Start = Total_Group - Number_Removed
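
    For reference, a minimal sketch of how a loop of this kind could count overlapping intervals directly. It assumes two closed intervals overlap when each one starts on or before the other ends. The four-row dataset and the variable n_overlap are made up for illustration and are not part of the real data.

```stata
clear
* Hypothetical toy data: three people in group "A", one in group "B"
input float id str8 group float Start_Date long end_date
1 "A" 100 110
2 "A" 105 120
3 "A" 115 130
4 "B" 100 130
end

generate n_overlap = .
local N = _N
quietly forvalues i = 1/`N' {
    * Count OTHER people in the same group whose interval overlaps person i's:
    * they start on or before i ends, and end on or after i starts
    count if group == group[`i'] & id != id[`i'] ///
        & Start_Date <= end_date[`i'] & end_date >= Start_Date[`i']
    replace n_overlap = r(N) in `i'
}
list id group Start_Date end_date n_overlap
```

    A loop like this makes one pass over the data per observation, so it may be slow on large datasets.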


    Thank you


  • #2
    I'm not sure I understand exactly what you're trying to do. To the extent I do understand it, I agree that -rangestat- alone cannot do the job, but I believe a couple of applications of -rangejoin- can do the heavy lifting here. Does this get what you want?
    Code:
    clear
    input float id str8 group float Start_Date long end_date float Transplanted
    1 "09054" 16807 16839 1
    2 "09054" 16812 16841 1
    3 "09054" 16831 16845 1
    4 "09054" 16838 16848 0
    5 "09054" 16852 16878 1
    6 "09054" 16891 16897 1
    7 "09054" 16898 16900 0
    8 "09054" 16835 16909 1
    9 "09054" 16877 16912 1
    10 "09054" 16908 16916 1
    11 "09952" 16877 16918 0
    12 "09952" 16926 16932 1
    13 "09952" 16940 16946 1
    14 "09952" 16840 16954 1
    15 "09952" 16926 16965 1
    16 "00952" 16908 16966 1
    17 "00952" 16960 16967 1
    18 "00952" 16961 16969 1
    19 "00952" 16968 16979 0
    20 "00952" 16944 16982 1
    21 "29002" 16988 16995 0
    22 "29002" 16975 16999 1
    23 "23002" 16971 17008 1
    24 "23002" 16937 17014 0
    25 "23002" 17017 17022 1
    26 "23002" 17015 17024 1
    27 "23002" 16926 17026 0
    28 "23002" 16924 17032 1
    29 "23002" 16982 17034 1
    30 "23002" 16996 17035 1
    end
    format %d Start_Date
    format %d end_date
    
    //    INTERVALS OVERLAP IF THE START OR END POINT OF ONE
    //    INTERVAL LIES INSIDE THE OTHER
    //    USE RANGEJOIN TWICE, ONCE FOR START, ONCE FOR ENDPOINT
    //    TO PAIR EACH OBSERVATION WITH ALL OTHERS IN GROUP
    //    WITH OVERLAPPING FOLLOW-UP INTERVALS
    tempfile copy
    save `copy'
    rangejoin Start_Date Start_Date end_date using `copy', by(group)
    tempfile holding
    save `holding'
    use `copy', clear
    rangejoin end_date Start_Date end_date using `copy', by(group)
    append using `holding'
    //    SOMETIMES BOTH OCCUR; JUST KEEP ONE OF THESE
    duplicates drop
    //    IF A OVERLAPS WITH B THEN MAKE B OVERLAP WITH A AS WELL
    save "`holding'", replace
    rename (id Start_Date end_date Transplanted) =_V
    rename *_U *
    rename *_V *_U
    append using `holding'
    duplicates drop
    
    //    GET A RUNNING COUNT OF TRANSPLANTS DONE AMONG THE MATCHES
    by id (end_date_U), sort: gen n_transplants = sum(Transplanted_U)
    by id: gen overlap_group_size = _N
    //    FIND FIRST END DATE AMONG OVERLAPS WHERE THE NUMBER OF TRANSPLANTS
    //    DONE IS AT LEAST HALF THE SIZE OF THE OVERLAP GROUP
    by id: egen date_half_transplanted = min(cond(2*n_transplants >= overlap_group_size, ///
        end_date_U, .))
    format date_half_transplanted %td
    Note: In this code, I assume that if a person is transplanted, the date of their transplant is their end_date. (You don't actually say how we are to know when a person is transplanted; I just thought that end_date might be a sensible guess about that.)
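
    The original question asks for the time from Start_Date to that date, which is just the difference between the two daily dates. A minimal sketch of that last step, assuming date_half_transplanted is constant within id and the data hold several rows per id (one per overlapping match). The toy values and the name time_to_half are made up for illustration.

```stata
clear
* Hypothetical toy data mimicking the layout after the matching step:
* repeated rows per id, with date_half_transplanted constant within id
input float id float Start_Date float date_half_transplanted
1 16807 16841
1 16807 16841
2 16812 16841
3 16831 .
end
format Start_Date date_half_transplanted %td

* Daily dates are stored as day counts, so the difference is a number of days
generate time_to_half = date_half_transplanted - Start_Date

* Reduce to one row per person
bysort id: keep if _n == 1
list
```

    An id whose overlap group never reaches the 50% mark has a missing date, so its time_to_half is missing as well.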

    Added: -rangejoin-, like -rangestat-, is from SSC. It was written by Robert Picard.
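
    A short sketch of the installation, for anyone who has not installed these commands yet. As far as I know, -rangejoin- relies on -rangestat- being installed, so it is safest to install both.

```stata
* Both packages are community-contributed and hosted on SSC
ssc install rangestat
ssc install rangejoin
```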



    • #3
      Yes, sorry for the lack of clarity. end_date is the date of transplant if they were transplanted. I will apply this code; it will take me a little time, as I am still a relative beginner, but it looks like it will work.

      Thank you.
