
  • Counting dates within subgroups.

    I have columns that represent the patient ID, the line of chemotherapy treatment, and the start and end date of each line. I also have a column that lists the dates of certain events of interest, which was merged in on the patient ID.

    Is there an easy way to create a variable that counts how many events of interest occurred during each treatment line?

    This is complicated by the fact that I had to merge the event dates on patient ID alone, so the event dates are listed for each patient ID and not for each line of a patient's treatment.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float(id line start_date_of_line end_date_of_line event_date)
     8 1 19418 19418 19508
     9 1 19409 19458     .
     9 2 19479 19549     .
    10 1 19395 19395     .
    11 1 19387 19492 19943
    11 1 19387 19492 20026
    12 1 19439 19439 19367
    13 1 19407 19407     .
    15 1 19402 19542     .
    16 1 19410 19486     .
    17 1 19907 19970 19472
    17 1 19907 19970 19798
    17 1 19907 19970 20113
    17 1 19907 19970 20352
    18 1 19907 19949 19918
    18 2 19970 19970 19917
    18 2 19970 19970 19958
    18 2 19970 19970 19959
    18 2 19970 19970 20006
    18 2 19970 19970 20079
    18 2 19970 19970 20120
    19 1 19404 19470 19688
    20 1 19380 19422 19490
    21 1 19794 19906 19411
    21 1 19794 19906 19458
    27 1 19443 19529 19841
    27 2 19694 19799 19841
    28 1 19373 19453 19401
    29 1 19417 19417 19620
    29 1 19417 19417 19734
    30 1 19395 19395     .
    40 1 19445 19488 19436
    40 1 19445 19488 19471
    40 1 19445 19488 19504
    41 1 19474 19537 19482
    42 1 19438 19515 19619
    42 2 19570 19570 19619
    43 1 19435 19610 19505
    44 1 19436 19688 19722
    44 1 19436 19688 19852
    44 2 19746 19792 19915
    45 1 19442 19485 19743
    end
    format %td start_date_of_line
    format %td end_date_of_line
    format %td event_date
    Last edited by Craig Knott; 03 Sep 2018, 03:12.

  • #2
    I don't think I understand this, but something like

    Code:
    egen whatever = total(inrange(attendance_date, start, end)), by(id line)
    may give you some ideas on technique.
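
For illustration, here is a minimal sketch of the same technique using the variable names from the -dataex- example in #1 (event_date, start_date_of_line, end_date_of_line) in place of the placeholder names above. The rows are a subset of the posted data, and the name n_events is an assumption made for this example.

```stata
* Subset of the posted example data (ids 18 and 40)
clear
input float(id line start_date_of_line end_date_of_line event_date)
18 1 19907 19949 19918
18 2 19970 19970 19917
18 2 19970 19970 19958
40 1 19445 19488 19436
40 1 19445 19488 19471
40 1 19445 19488 19504
end
format %td start_date_of_line end_date_of_line event_date

* inrange() is 1 when start <= event_date <= end and 0 otherwise,
* including when event_date is missing; total() sums those 1s
* within each id/line group
egen n_events = total(inrange(event_date, start_date_of_line, end_date_of_line)), by(id line)

list, sepby(id line)
```

This only counts an event on the rows where its date happens to be stored, so an event listed against one line but falling within another line's range is not picked up.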


    • #3
      Originally posted by Nick Cox
      I don't think I understand this, but something like

      Code:
      egen whatever = total(inrange(attendance_date, start, end)), by(id line)
      may give you some ideas on technique.
      Thanks for the suggestion, Nick. The code is comparable to what I've written so far, and both are limited by the same problem: because event_date was merged on the patient ID, the full list of event dates is not repeated within each of a patient's treatment lines. For example, the merge may place an event date that falls within the range of treatment line 2 only on the row for treatment line 1, in which case it isn't counted.

      I'm wondering whether there's a neat way of taking all distinct event dates for a patient ID and repeating them across each treatment line.
      Last edited by Craig Knott; 03 Sep 2018, 03:46.


      • #4
        I've resolved this issue by restructuring the data I was merging. Phew.
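
The thread does not say how the data were restructured. As one possible approach, and not necessarily the one used here, the sketch below pairs every treatment line with every distinct event date for the same patient using -joinby-, then counts the events that fall within each line. The tempfile name events and the variable n_events are assumptions made for this example; the rows are a subset of the data posted in #1.

```stata
* Subset of the posted example data (ids 9, 18 and 27)
clear
input float(id line start_date_of_line end_date_of_line event_date)
 9 1 19409 19458     .
 9 2 19479 19549     .
18 1 19907 19949 19918
18 2 19970 19970 19917
18 2 19970 19970 19958
27 1 19443 19529 19841
27 2 19694 19799 19841
end
format %td start_date_of_line end_date_of_line event_date

* Events file: one row per distinct patient/event date
preserve
keep id event_date
drop if missing(event_date)
duplicates drop
tempfile events
save `events'
restore

* Lines file: one row per patient/treatment line
keep id line start_date_of_line end_date_of_line
duplicates drop

* Form all line-by-event pairs within patient; unmatched(master)
* keeps lines for patients with no events (event_date missing)
joinby id using `events', unmatched(master)

* Count events falling within each line's date range
egen n_events = total(inrange(event_date, start_date_of_line, end_date_of_line)), by(id line)

* Return to one row per patient/treatment line
drop event_date _merge
duplicates drop
sort id line
list, sepby(id)
```

In this subset, patient 18 has an event dated 19917 stored on a line 2 row that falls within line 1's range, so line 1 is credited with two events rather than the one found when counting only within the rows as merged.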
