  • Help generating days from initial visit

    Hi Forum,

    I have a dataset on individuals treated in a medical clinic. I would like to create a variable that counts the number of days between an encounter (aka visit) and the first visit for this course of treatment. This value is in days and will be stored in the variable below titled 'days_from_initial'.

    The challenge is how we define the initial encounter. We created the variable 'visit_num' to help here. This variable numbers the visits 1 to _n. A visit is numbered 1 if it is the patient's first visit for a new body region. If the days between visits exceeded 90, we start counting over at 1 because this is, in effect, a new course of treatment. If the region changed (e.g., the same patient had an original knee complaint, but this shifted to the neck), visit_num also starts over at 1.

    So all I need to do is compute visit_date[_n] - visit_date[1], but I'm having a hard time with it. I need it to count days from the most recent visit where visit_num = 1. The sample dataset below shows it better than I can explain. fakid 1 has 2 visits for Cervical, then 10 for Lumbar, then 2 for a new episode of Lumbar. This makes 3 episodes of care, and days_from_initial should start counting over at each visit_num = 1.

    Thanks for any help!
    Ben

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float(fakid visit_date visit_num) long region float(days_btwn_visits days_from_initial)
    1 23722  1 2  0 .
    1 23757  2 2 35 .
    1 23474  1 4  0 .
    1 23476  2 4  2 .
    1 23478  3 4  2 .
    1 23481  4 4  3 .
    1 23489  5 4  8 .
    1 23512  6 4 23 .
    1 23524  7 4 12 .
    1 23573  8 4 49 .
    1 23581  9 4  8 .
    1 23583 10 4  2 .
    1 23798  1 4  0 .
    1 23839  2 4 41 .
    2 23133  1 9  0 .
    2 23490  1 9  0 .
    2 23506  2 9 16 .
    2 23553  3 9 47 .
    2 23595  5 9 42 .
    2 23671  6 9 76 .
    2 23715  7 9 44 .
    2 23841  1 9  0 .
    3 23432  1 9  0 .
    3 23754  1 9  0 .
    3 23782  2 9 28 .
    4 23385  1 4  0 .
    4 23510  1 4  0 .
    4 23547  2 4 37 .
    end
    format %tdnn/dd/ccYY visit_date
    label values region region_lab
    label def region_lab 2 "Cervical", modify
    label def region_lab 4 "Lumbar", modify
    label def region_lab 9 "Knee", modify

  • #2
    Code:
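    * Record the current row order so it can be used as the sort key within patient
    gen `c(obs_t)' obs_no = _n
    * Running count of visit_num == 1 within patient gives one id per episode of care
    by fakid (obs_no), sort: gen byte episode = sum(visit_num == 1)
    * Days elapsed since the first visit of each episode
    by fakid episode (visit_num), sort: replace days_from_initial = visit_date - visit_date[1]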
    will do this.

    I'm curious how you got to the data set you have in the first place. I ask because creating a visit_num variable that restarts at 1 on a change of region, or after a gap of more than 90 days, is more complicated than this. At least in the way I would have created the variables you are starting with, I would have had to create the episode variable from the code above along the way, and that variable is the key to getting days_from_initial correct.
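
    To illustrate that point, here is a minimal sketch of one way the starting variables could be built from raw visit records. It is not necessarily how the data in #1 were actually produced. It assumes a hypothetical raw data set holding only fakid, visit_date, and region, and it takes the rules from #1 at face value: a new episode begins at the first visit for a patient-region pair or after a gap of more than 90 days. The variable new_episode is introduced here purely for illustration.

    ```stata
    * Hypothetical raw data: one row per visit, before visit_num exists
    clear
    input float(fakid visit_date) long region
    1 23722 2
    1 23757 2
    1 23474 4
    1 23476 4
    1 23478 4
    1 23798 4
    1 23839 4
    end
    format %td visit_date

    * Assumed rule: an episode starts at the first visit for a patient-region
    * pair, or when more than 90 days have passed since the previous visit
    bysort fakid region (visit_date): gen byte new_episode = (_n == 1) | (visit_date - visit_date[_n-1] > 90)

    * Running count of episode starts within each patient-region pair
    by fakid region: gen int episode = sum(new_episode)

    * Visit number and elapsed days both restart within each episode
    bysort fakid region episode (visit_date): gen int visit_num = _n
    by fakid region episode: gen days_from_initial = visit_date - visit_date[1]

    list, sepby(fakid region episode)
    ```

    Note that in this sketch episode is numbered within each patient-region pair rather than within patient, so the episode id has to be created before visit_num can be numbered.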



    • #3
      Thanks so much, Clyde. This worked great. I was able to create visit_num and days_btwn_visits with no problem, and then I restarted the numbering whenever the days between visits exceeded 90. I just couldn't figure out how to count from the correct starting visit. I wasn't aware of this sum(visit_num == 1) trick, which is quite helpful!
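
      For anyone else meeting this idiom for the first time, the sketch below shows the running sum on a small slice of the example data (fakid 3 and 4 only). The expression visit_num == 1 evaluates to 1 on the first visit of an episode and to 0 otherwise, and Stata's sum() returns a running total, so the total steps up by one each time a new episode begins. The sketch assumes the rows are already in visit order within patient, as they are in the example.

      ```stata
      * Slice of the example data from #1: fakid 3 and 4 only
      clear
      input float(fakid visit_date visit_num)
      3 23432 1
      3 23754 1
      3 23782 2
      4 23385 1
      4 23510 1
      4 23547 2
      end
      format %td visit_date

      * Keep the incoming row order as the sort key within patient
      gen `c(obs_t)' obs_no = _n

      * (visit_num == 1) is 1 at an episode start and 0 elsewhere;
      * its running sum within patient is the episode counter
      by fakid (obs_no), sort: gen byte episode = sum(visit_num == 1)

      * Subtract the first visit date of the episode from each visit date
      by fakid episode (visit_num), sort: gen days_from_initial = visit_date - visit_date[1]

      list, sepby(fakid episode)
      ```

      Each patient here ends up with two episodes, and the second visit of the later episode is 28 days (fakid 3) and 37 days (fakid 4) from its start, matching days_btwn_visits for those rows in the example.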
