  • Keeping multiple dates for different events per individual ID.

    Hi, my fellow Stata users. I have multiple diagnosis dates for different conditions per patient, and I am trying to keep the earliest diagnosis date per condition per patient. I have tried -bysort ID (CCI_date): keep if _n == 1-,
    but that keeps only the earliest diagnosis date overall for each patient rather than the earliest for each condition. I want to end up with several diagnosis dates per patient, each corresponding to a different condition. I hope that's not too confusing!

    Thank you in advance. Below is sample data:

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str1 ID byte charlson_category float CCI_date
    "1" 1 14976
    "1" 3 15721
    "1" 3 15734
    "1" 6 21375
    "2" 6 21375
    "2" 4 11688
    "2" 6 20977
    "2" 6 20576
    "2" 6 20233
    "2" 6 20233
    "2" 4 20030
    "3" 4 19710
    "3" 6 20627
    "3" 6 20258
    "4" 6 20258
    "4" 4 21339
    "4" 6 21025
    "4" 4 11688
    "5" 4 19682
    "5" 1 19780
    end
    format %td CCI_date
    label values charlson_category charlson_category
    label def charlson_category 1 "Any malignancy", modify
    label def charlson_category 3 "Chronic pulmonary disease", modify
    label def charlson_category 4 "Congestive heart failure", modify
    label def charlson_category 6 "Diabetes", modify

  • #2
    To keep the earliest observation within each combination of ID and charlson_category, add charlson_category to the by-list:
    Code:
    by ID charlson_category (CCI_date), sort: keep if _n==1
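
    The original attempt grouped on ID alone, so _n == 1 picked out one row per patient. For anyone who would rather check the result before discarding rows, a minimal sketch along the same lines is to flag the earliest record instead of dropping the rest straight away. The variable name first_dx is made up for illustration; -isid- then confirms that the kept rows are unique on ID and charlson_category.

    Code:
    * first_dx is a hypothetical name: 1 marks the earliest date within each ID-condition pair
    bysort ID charlson_category (CCI_date): gen byte first_dx = _n == 1
    list ID charlson_category CCI_date if first_dx, sepby(ID)
    * after inspecting, keep the flagged rows and confirm one row per ID-condition pair
    keep if first_dx
    isid ID charlson_category

    Ties on the earliest date (ID 2 has two Diabetes records on the same date in the sample) are harmless here, because either record carries the same date.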

    • #3
      Thank you, that has worked perfectly.

      • #4
        Hi, I have a similar issue. In the sample below, some IDs appear twice, once with a Dementia_Type of "Alzheimers Disease" and once with "Unspecified Dementia". For those IDs I want to prioritise the Alzheimer's diagnosis and keep it regardless of diagnosis date. That is, if an ID was diagnosed with Alzheimer's 5 years after a diagnosis of unspecified dementia, I want to keep only the Alzheimer's diagnosis date for that ID.

        Code:
        * Example generated by -dataex-. For more info, type help dataex
        clear
        input float ID int Diagnosis_Date long Dementia_Type
         1 13808 6
         2 15299 1
         3 20792 6
         3 20773 1
         4 15594 6
         4 18176 1
         5 20076 6
         6 14748 1
         7 13082 1
         7 15461 6
         8 18120 1
         8 14243 6
         9 20881 1
         9  9313 6
        10 18652 6
        10 20527 1
        11 15932 6
        12 12799 1
        13 15257 6
        14 17224 6
        15 11299 6
        end
        format %td Diagnosis_Date
        label values Dementia_Type subtype_n
        label def subtype_n 1 "Alzheimers Disease", modify
        label def subtype_n 6 "Unspecified Dementia", modify

        • #5
          Code:
          * within each ID, drop the "Unspecified Dementia" (6) row when the ID also has a different type
          by ID (Dementia_Type), sort: drop if Dementia_Type==6 & Dementia_Type[1]!=Dementia_Type[_N]
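
          This relies on the sample containing only codes 1 and 6, so that a difference between the first and last sorted values means both types are present for that ID. If other dementia types could occur, a more explicit sketch is to flag IDs with any Alzheimer's record and drop their other rows. Here has_ad is a hypothetical variable name, and treating code 1 as the only priority type is an assumption.

          Code:
          * has_ad is a hypothetical name: 1 if the ID has any Alzheimer's (code 1) record
          egen byte has_ad = max(Dementia_Type == 1), by(ID)
          * for those IDs, keep only the Alzheimer's rows; other IDs are left untouched
          drop if has_ad & Dementia_Type != 1
          drop has_ad

          If an ID can have several Alzheimer's records, the earliest-date approach from #2 can be applied afterwards to keep one row per ID.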

          • #6
            Thank you again!
