  • Identify events as belonging to the same spell when the duration of consecutive events does not overlap

    Dear Statalist users,

    I am working with a longitudinal dataset in "long" format, made up of events that each have a start date and an end date. These "events" are protection measures in favour of victims of intimate partner violence (IPV). Each protection measure is a row. I am working in Stata 16.

    Measures are ordered in the dataset by start date: the first protection measure is the one with the earliest start date, the second is the one that starts next, and so on. The duration of a protection measure may overlap partly or completely with that of other protection measures (consecutive or not). I want to create grand spells, that is, periods during which each protection measure overlaps with the immediately preceding measure or with any earlier-starting measure. Each spell would run from the start date of its first measure to the end date of its last-ending measure, provided the measures (or any intermediate measures) overlap so that there is no period during which no protection measure is in force. Whenever a protection measure ends and no other measure is in force, this marks the end of a grand spell.

    My problem is with protection measures whose duration overlaps but which do not appear consecutively in the dataset. My question is: how can I identify measures belonging to the same spell if their duration overlaps but they are not consecutive?

    I hope the example table below helps to clarify my point. Protection measures #1 to #4 (see table) are part of the same spell (spell 1). From the start date of measure #1 (the earliest-starting measure) to the end date of measure #4 (the latest-ending measure) there is no period during which no measure is in force. Measure #5, however, is part of a different spell (spell 2). I am able to identify in Stata whether consecutive protection measures overlap (e.g., whether measures #2 and #1 overlap, or whether measures #3 and #2 overlap) but not whether non-consecutive measures do so (e.g., protection measures #1 and #3). Which syntax should I use to refer protection measure #3 back to protection measure #1? Note that the measure referred to need not necessarily be the first one in the dataset (so _n=1 does not always work).

    protection_measure  start_date1  end_date1  overlaps_with_previous_measure  spell
    1                   1/1/2004     3/7/2015   -                               1
    2                   21/2/2011    19/2/2013  1 (Yes)                         1
    3                   3/8/2013     18/3/2015  0 (No)                          1
    4                   8/8/2013     2/1/2016   1 (Yes)                         1
    5                   1/1/2017     3/5/2017   0 (No)                          2

    The code below generates the first three columns of the table. I created the last two columns manually, directly in the table.

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float(protection_measure start_date1 end_date1)
    1 16071 20272
    2 18679 19408
    3 19573 20165
    4 19578 20455
    5 20820 20942
    end
    * Display dates as day/month/year, as in the example table
    format %tddd/nn/CCYY start_date1 end_date1

    The code for creating an indicator of whether immediately consecutive measures overlap is the following:

    Code:
    * 1 if a measure starts on or before the end date of the previous measure,
    * 0 if it starts after that end date; missing for the first measure
    gen byte overlaps_with_previous_measure = (start_date1 <= end_date1[_n-1]) if _n > 1
    Thank you very much for your attention.

  • #2
    Code:
    isid protection_measure, sort
    reshape long @date1, i(protection_measure) j(event) string
    gsort date1 -event
    gen int depth = sum((event == "start_") - (event == "end_"))
    gen int spell = sum(depth > 0 & inlist(depth[_n-1], 0, .))
    collapse (min) begin = date1 (max) end = date1, by(spell)
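
    The code above collapses the data to one row per spell. If the spell number is needed on each protection measure instead, as in the last column of the table in #1, one possible sketch is to compare each start date with the latest end date among all earlier-starting measures. This is a different technique from the code above and was not posted in this thread; it assumes the dataex example from #1, and max_prev_end, spell_begin and spell_end are illustrative names.

    ```stata
    clear
    input float(protection_measure start_date1 end_date1)
    1 16071 20272
    2 18679 19408
    3 19573 20165
    4 19578 20455
    5 20820 20942
    end

    * Process measures in order of start date
    sort start_date1 end_date1

    * Running maximum of the end dates of all earlier-starting measures.
    * -replace- works through the observations in order, so max_prev_end[_n-1]
    * already holds the updated value; max() ignores missing arguments.
    gen double max_prev_end = .
    replace max_prev_end = max(max_prev_end[_n-1], end_date1[_n-1]) if _n > 1

    * A new spell begins at the first measure and whenever a measure starts
    * after every earlier-starting measure has ended
    gen int spell = sum(_n == 1 | start_date1 > max_prev_end)

    * Attach the spell limits to every measure
    bysort spell: egen spell_begin = min(start_date1)
    bysort spell: egen spell_end = max(end_date1)
    format %td spell_begin spell_end

    list protection_measure start_date1 end_date1 spell spell_begin spell_end, noobs
    ```

    On the example data this assigns measures #1 to #4 to spell 1 and measure #5 to spell 2. As in the code above, a measure that starts on the very day the previous protection ends is treated as continuing the same spell.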

    • #3
      Dear Clyde,

      Many thanks for your answer. I have a follow-up question about your code: what exactly does the following line do?
      Code:
       gen int depth = sum((event == "start_") - (event == "end_"))

      • #4
        It creates a variable, depth, that counts the total number of protections that are in place as of the date in the observation.
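
        To see this concretely, a minimal sketch that stops before the -collapse- and lists the running count may help. It assumes the dataex example from #1 and reuses the commands from #2 unchanged; only the -format- and -list- lines are added for illustration.

        ```stata
        clear
        input float(protection_measure start_date1 end_date1)
        1 16071 20272
        2 18679 19408
        3 19573 20165
        4 19578 20455
        5 20820 20942
        end

        * One row per event: event is "start_" or "end_", date1 holds its date
        reshape long @date1, i(protection_measure) j(event) string

        * Chronological order; on tied dates starts are placed before ends
        gsort date1 -event

        * +1 at every start, -1 at every end, accumulated down the data
        gen int depth = sum((event == "start_") - (event == "end_"))

        * A spell begins wherever depth rises above 0 from 0 (or at the first row)
        gen int spell = sum(depth > 0 & inlist(depth[_n-1], 0, .))

        format %td date1
        list protection_measure event date1 depth spell, noobs
        ```

        In this example depth takes the values 1, 2, 1, 2, 3, 2, 1, 0, 1, 0 down the sorted events. It returns to 0 only when measure #4 ends and again when measure #5 ends, and those two points are where the two spells close.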

        • #5
          Dear Clyde,

          Thank you very much for the additional clarification and for your help in general. The code you suggested works, and I was able to achieve exactly what I wanted.
