Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Merging 2 datasets on overlapping period of time

    I have a question related to merging two datasets on an overlapping period of time. I have found some posts with similar questions (for example, the post
    HTML Code:
    https://www.statalist.org/forums/forum/general-stata-discussion/general/1379199-sum-overlapping-days-based-on-dates-of-all-rows
    ), however, my situation is a bit different than theirs. In addition, I have a pretty large dataset, so I don't know if command like 'expand' will work out on my PC.

    Basically, I have one dataset with ID, starting time A, ending time A, and another dataset with ID, starting time B, ending time B. I need to combine these two datasets on the overlapping period of time and generate a dummy variable that equals one if there is any overlapping period for the same ID. Here is an example:

    Dataset A

    Code:
    clear
    input byte id float(startdateA enddateA)
    1 16071 16437
    2 16406 16496
    3 16437 16527
    end
    format %td startdateA
    format %td enddateA

    Dataset B

    Code:
    clear
    input byte id float(startdateB enddateB)
    1 16050 16430
    2 17000 17200
    3 16400 16550
    end
    format %td startdateB
    format %td enddateB

    What I want to end up with:

    Code:
    clear
    input byte id float(startdateA enddateA startdateB enddateB overlap)
    1 16071 16437 16050 16430 1
    2 16406 16496 . .  0
    3 16437 16527 16400 16550 1
    end
    format %td startdateA startdateB
    format %td enddateA enddateB



    Thank you!

  • #2
    I think you should be able to do this with rangejoin (SSC) in three steps.
    1. Determine if start date B is between start & end A (first rangejoin)
    2. Determine if end date B is between start & end A (second rangejoin)
    3. There is overlap if either or both are in the interval. You can even determine the overlap quite easily as you know if its the start or end date that's in the initial range.

    Comment


    • #3
      Hi Jesse,

      Thank you for your post. That is exactly what I am doing now. However, I was not very sure about the methode. So now you confirmed that this solution could make the trick. Thanks!

      Comment

      Working...
      X