  • Help with -stsplit- on dates

    Colleagues,

    I have a survival dataset with multiple episodes per patient, defined by their exposure to a certain treatment. I want to split the episodes further at 31 December of each year so that I can produce end-of-year summaries.

    I have tried to follow https://www.stata.com/statalist/arch.../msg00702.html

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str9 mrm_patient float dead int(rxdate end_recorddate) float exitdate
    "P10000697" 0 14368 14497 15627
    "P10000697" 0 14497 15627 15627
    "P10004363" 0 14460 14544 14797
    "P10004363" 0 14544 14797 14797
    "P10005983" 1 14116 14176 15705
    "P10015883" 0 15301 15308 15705
    "P10015883" 0 15308 15705 15705
    "P10025513" 0 15476 15487 15705
    "P10025513" 0 15487 15491 15705
    "P10025513" 0 15491 15705 15705
    "P1042685"  0 14593 14635 15705
    "P1042685"  0 14635 15705 15705
    "P1042774"  0 14667 14686 15705
    "P1042774"  1 14686 15394 15705
    "P1042954"  0 14557 14662 15705
    "P1042954"  0 14662 15705 15705
    "P1042955"  0 14547 15705 15705
    "P1043134"  0 14630 14678 15705
    "P1043134"  0 14678 15705 15705
    "P1043597"  0 20146 20662 21184
    "P1043597"  0 20662 21184 21184
    "P1043608"  0 20003 20012 20880
    "P1043608"  0 20012 20125 20880
    "P1043608"  0 20125 20290 20880
    "P1043608"  0 20290 20313 20880
    "P1043608"  0 20313 20880 20880
    "P1043619"  0 20475 20506 21184
    "P1043619"  0 20506 20534 21184
    "P1043619"  0 20534 20698 21184
    "P1043619"  0 20698 21184 21184
    end
    format %d rxdate
    format %d end_recorddate
    format %d exitdate
    This is now -stset- thus:

    Code:
    stset end_recorddate, id(mrm_patient) failure(dead==1) exit(exitdate) origin(time rxdate) scale(365.25)
    I now try to -stsplit- the dataset:

    Code:
    stsplit year, at(14244 14609 14975)
    and get

    Code:
    . stsplit year, at(14244 14609 14975)
    (no new episodes generated)
    Clearly this is my error, but I am not seeing where. Can someone advise please?

    Thanks

    MM

  • #2
    In your -stset- you specified rxdate as the time origin, so analysis time is now "time since rxdate", and the cutpoints given to -stsplit- must be on that same timescale. You also used scale(365.25), which changes the time units from days to years. Your -stsplit- is therefore asking Stata to split each patient's follow-up at 14244, 14609 and 14975 years after rxdate. I'm guessing nobody is at risk for more than 14244 years after rxdate, so no splitting is done.
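
    To make the mismatch concrete, here is a small sketch (an illustration, not something I ran on the full dataset) that converts the cutpoints back to calendar dates and then looks at the range of analysis time. It assumes the data have just been -stset- with the original command, i.e. with origin(time rxdate) and scale(365.25).

    ```stata
    * The cutpoints are Stata daily dates (days since 01jan1960)
    display %td 14244
    * shows 31dec1998
    display %td 14609
    * shows 31dec1999
    display %td 14975
    * shows 31dec2000

    * After the original -stset-, _t is measured in years since rxdate,
    * so its maximum should be a handful of years, nowhere near 14244
    summarize _t0 _t
    ```

    If the maximum of _t is only a few years, cutpoints in the thousands can never fall inside anyone's follow-up, which is why no new episodes are generated.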

    I suggest you do the following:

    Code:
    stset end_recorddate, id(mrm_patient) failure(dead==1) exit(exitdate)
    stsplit year, at(14244 14609 14975)
    That is, first stset the data with calendar time as the timescale. You can then do another stset to get the data on the timescale you want for the analysis (i.e., your original timescale).
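
    The whole sequence might look like the sketch below. This is a hedged outline rather than a tested solution: the local macro name cuts and the year range 1998/2017 are my assumptions (chosen to span the dates in the example data), and the final -stset- simply repeats the command from the original post.

    ```stata
    * Step 1: stset with calendar time (days since 01jan1960) as the timescale
    stset end_recorddate, id(mrm_patient) failure(dead==1) exit(exitdate)

    * Step 2: build one cutpoint per 31 December
    * (year range 1998/2017 is an assumption; adjust to your data)
    local cuts
    forvalues y = 1998/2017 {
        local cuts `cuts' `=mdy(12, 31, `y')'
    }
    stsplit year, at(`cuts')

    * Step 3: return to the analysis timescale (years since rxdate)
    stset end_recorddate, id(mrm_patient) failure(dead==1) exit(exitdate) origin(time rxdate) scale(365.25)
    ```

    Building the cutpoints with mdy() avoids typing date numbers by hand and handles leap years. Because the calendar-time -stset- has no enter() or time0() option, it may be worth checking _t0 on each patient's first record after the split.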

    It's always useful to look at your data after stset and stsplit:

    Code:
    list mrm_patient rxdate end_recorddate exitdate _t0 _t _d
    Thanks for providing your data. I apologise I didn't provide a complete solution using your data. I had a few minutes before a day of meetings starts and hope this is helpful.

    • #3
      Thanks for this, as usual in retrospect it looks obvious. Thanks for the solution, it works well :-)
