  • Defining multiple events within a definite period in an iterative way.

    Hi there,

    I have a dataset in which patients can have more than one visit. I need to define episodes and analyse the data within each episode's time frame, so I have to create an episode variable based on the visit dates. For example, if a person has multiple visits within 30 days, they all belong to episode 1. If this person has another visit after those first 30 days, it starts episode 2, and so on.
    I am doing this with the code below, but I am wondering whether there is an iterative way of doing it faster:

    In the code below I identify the earliest visit date (visit_date) for each patient (patid) and compute the lag (diff_time1) between every visit date and that first date (first_time_visited). All visits that happened within 30 days of it go into episode 1. Next, among the visits that were not included in episode 1, I identify the earliest date and compute the difference between every remaining visit and that date. Again, every visit within 30 days is assigned to that episode, here episode 2. I continue like this until every visit has an episode.

    Would appreciate any suggestion on making this faster.
    *Define episode 1

    egen first_time_visited=min(visit_date), by(patid)
    gen diff_time1=.
    * bysort (rather than by) so the data need not already be sorted by patid
    bysort patid: replace diff_time1=visit_date-first_time_visited
    gen episode=.
    replace episode=1 if diff_time1<=30

     
    *Define episode 2
    * note: testing for missing needs ==, not =

    egen second_time_visited=min(visit_date) if episode==., by(patid)
    gen diff_time2=.
    bysort patid: replace diff_time2=visit_date-second_time_visited
    replace episode=2 if diff_time2<=30


    *Define episode 3

    egen third_time_visited=min(visit_date) if episode==., by(patid)
    gen diff_time3=.
    bysort patid: replace diff_time3=visit_date-third_time_visited
    replace episode=3 if diff_time3<=30

     
    Thank you so much!


    Adriana

     






  • #2
    I'm pretty sure you can do this more simply. But I also don't understand exactly how you are defining episode. If a patient has a visit on day 1, and another on day 30 those are both part of episode 1. Now if that patient visits again on day 35, is that still episode 1 (because it is within 30 days of the day 30 visit) or is that in episode 2?



    • #3
      Clinical or medical meaning presumably should come first here, but why not record the time since the previous visit and let the start of a new episode be defined by a sufficiently large gap?
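
      A minimal sketch of that gap-based idea, assuming the variable names patid and visit_date from #1 and a 30-day threshold chosen purely for illustration (the right gap is a clinical judgement). The toy patient and visit days are made up, and gap and episode_gap are hypothetical names.

      ```stata
      * Toy data: one hypothetical patient, visit days counted from an arbitrary origin
      clear
      input patid visit_date
      1 0
      1 20
      1 40
      1 100
      end

      * Gap since the previous visit (missing for each patient's first visit)
      bysort patid (visit_date): gen gap = visit_date - visit_date[_n-1]

      * A new episode starts at the first visit or after a gap of more than 30 days
      by patid: gen episode_gap = sum(_n == 1 | gap > 30)

      list, sepby(patid)
      ```

      Here the visits on days 0, 20 and 40 chain into one episode because no single gap exceeds 30 days, and only day 100 starts a second one. That is a different definition from a fixed 30-day window, so which one is wanted needs to be settled first.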



      • #4
        To respond to Clyde's question: after the first 30 days I should start counting the second episode. So I have to reset the time, and a person who visited on day 35 would be counted in episode 2.
        I hope that is clear.

        Thanks for helping,

        Adriana



        • #5
          So you're really just aggregating visits into 30 day blocks. So, I assume you have two variables: patient_id and visit_date, the latter being a Stata internal format date variable (not a string variable that humans can read as dates, and not a numeric variable that takes on values like 08232016 or the like.) If you are not familiar with Stata internal format date variables, do not proceed until you learn about them by reading -help datetime-.

          Code:
          by patient_id (visit_date), sort: gen block30 = floor((visit_date-visit_date[1])/30)
          by patient_id (block30), sort: gen episode = sum(block30 != block30[_n-1])
          The variable episode will give you consecutively numbered groups of visits in 30-day intervals, starting from 1.

          Caution: not tested, beware of typos.
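
          For a quick check of what those two lines produce, here is a small sketch on illustrative dates for one hypothetical patient. The patient_id value, the string variable sdate and the dates themselves are assumptions for the demonstration, not real data.

          ```stata
          * Toy data: four illustrative visit dates for one hypothetical patient
          clear
          input patient_id str9 sdate
          1 "19dec2012"
          1 "27dec2012"
          1 "03apr2013"
          1 "29apr2013"
          end

          * Convert the strings to a Stata internal format daily date
          gen visit_date = daily(sdate, "DMY")
          format visit_date %td

          * Same two lines as above: 30-day blocks counted from each patient's first visit
          by patient_id (visit_date), sort: gen block30 = floor((visit_date-visit_date[1])/30)
          by patient_id (block30), sort: gen episode = sum(block30 != block30[_n-1])

          list patient_id visit_date block30 episode, sepby(patient_id)
          ```

          These visits fall 0, 8, 105 and 131 days after the first one, so block30 is 0, 0, 3, 4 and episode is 1, 1, 2, 3. Note that the block boundaries are fixed multiples of 30 days from the first visit; they do not restart at the first visit of each later episode.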



          • #6
            Thank you so much! It really works perfectly.

            Adriana



            • #7
              Hi again. Actually, when I looked more closely at the code I found a problem. The block is always calculated from the very first visit date, which is not what I was looking for.
              Once a visit falls more than 30 days after the start of the current episode, I want the timing to restart from that visit's date. For example, if the person visited on December 19 2012 (episode 1), December 27 2012 (episode 1), April 3 2013 (episode 2) and April 29 2013 (episode 2), the timing for the last two visits should be counted from April 3 2013. And if the person had another visit on, let's say, June 5 2013 (episode 3), the timing should restart from June 5 2013. If the blocks are based on the very first visit date, the visit on April 29 2013 is misclassified as episode 3 instead of episode 2. I am not sure whether it is possible to tweak the code to do that.

              Sorry for bothering again.

              Adriana
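
              One possible tweak, sketched under assumptions rather than offered as a tested solution: carry an "anchor" date forward within each patient and reset it whenever a visit falls more than 30 days after the current anchor. It relies on replace working through the observations in order, so each row sees the anchor already set on the row above. The names anchor and sdate are hypothetical, visit_date is assumed to be a non-missing Stata daily date, and "within 30 days" is read as a difference of at most 30, as in #1.

              ```stata
              * Toy data: the dates from the example above, for one hypothetical patient
              clear
              input patid str9 sdate
              1 "19dec2012"
              1 "27dec2012"
              1 "03apr2013"
              1 "29apr2013"
              1 "05jun2013"
              end
              gen visit_date = daily(sdate, "DMY")
              format visit_date %td

              * anchor = start date of the episode a visit belongs to
              bysort patid (visit_date): gen long anchor = visit_date if _n == 1

              * Rows are updated in order, so anchor[_n-1] is already filled in;
              * a visit more than 30 days after the current anchor becomes the new anchor
              by patid: replace anchor = cond(visit_date - anchor[_n-1] > 30, visit_date, anchor[_n-1]) if _n > 1
              format anchor %td

              * Number the episodes: count up each time the anchor changes
              by patid: gen episode = sum(anchor != anchor[_n-1])

              list patid visit_date anchor episode, sepby(patid)
              ```

              On these five dates the anchors are December 19 2012, April 3 2013 and June 5 2013, and episode comes out as 1, 1, 2, 2, 3, which matches the classification described above. Because it is a single pass over sorted data, it also avoids writing one block of code per episode.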
