Dear Statalist users,
I am working with a longitudinal dataset of events, each with a start date and an end date, stored in "long" format. The events are protection measures granted to victims of intimate partner violence (IPV), and each protection measure is one row. I am using Stata 16.
The measures are sorted by start date: the first measure is the one that starts earliest, the second is the one that starts next, and so on. The period covered by a measure may overlap partly or completely with that of other measures, whether or not they are adjacent in the dataset. I want to build grand spells, that is, continuous periods in which each measure starts while the immediately preceding measure, or any earlier-starting measure, is still in force. Each spell would run from the start date of its earliest-starting measure to the end date of its latest-ending measure, provided that the measures overlap directly or through intermediate measures, so that at least one measure is in force throughout. Whenever a measure ends and no other measure is in force, the grand spell ends.
My problem is with protection measures whose duration overlaps but which do not appear consecutively in the dataset. My question is: how can I identify measures belonging to the same spell if their duration overlaps but they are not consecutive?
I hope the example table below clarifies my point. Protection measures #1 to #4 belong to the same spell (spell 1): from the start date of measure #1 (the earliest-starting measure) to the end date of measure #4 (the latest-ending measure), there is no period without a measure in force. Measure #5, however, belongs to a different spell (spell 2). I can identify in Stata whether consecutive measures overlap (e.g., measures #2 and #1, or measures #3 and #2), but not whether non-consecutive measures do (e.g., measures #3 and #1). Which syntax should I use to compare measure #3 with measure #1? Note that the earlier measure is not necessarily the first one in the dataset, so referring to the first observation (_n == 1) does not always work.
protection_measure | start_date1 (d/m/yyyy) | end_date1 (d/m/yyyy) | overlaps_with_previous_measure | spell |
1 | 1/1/2004 | 3/7/2015 | - | 1 |
2 | 21/2/2011 | 19/2/2013 | 1 (Yes) | 1 |
3 | 3/8/2013 | 18/3/2015 | 0 (No) | 1 |
4 | 8/8/2013 | 2/1/2016 | 1 (Yes) | 1 |
5 | 1/1/2017 | 3/5/2017 | 0 (No) | 2 |
The code below generates the first three columns of the table. I filled in the last two columns by hand.
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input float(protection_measure start_date1 end_date1)
1 16071 20272
2 18679 19408
3 19573 20165
4 19578 20455
5 20820 20942
end
format %tdnn/dd/CCYY start_date1
format %tdnn/dd/CCYY end_date1
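The code for creating an indicator of whether immediately consecutive measures overlap is the following.
Code:
* The start date of a protection measure overlaps with the period in force of the previous measure
* (the _n > 1 guard matters: end_date1[0] is missing, and missing compares as larger than any date)
gen overlaps_with_previous_measure = 1 if start_date1 <= end_date1[_n-1] & _n > 1
* The start date of a protection measure does not overlap with the period in force of the previous measure
replace overlaps_with_previous_measure = 0 if start_date1 > end_date1[_n-1] & _n > 1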
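One possible way to handle non-consecutive overlaps, without comparing every pair of measures, might be to carry forward the latest end date seen so far: a measure would then start a new grand spell only if its start date is later than that running maximum. Explicit subscripts such as end_date1[_n-2] can reach a specific earlier row, but they require knowing in advance how many rows back to look. What follows is only a minimal sketch under stated assumptions, not a tested solution for the full dataset: the variable names max_end, new_spell, spell, spell_start and spell_end are hypothetical, the data are assumed to hold a single sequence of measures, and a start date equal to an earlier end date is treated as an overlap, as in the indicator above.

```stata
* Sketch only: max_end, new_spell, spell, spell_start and spell_end are hypothetical names
clear
input float(protection_measure start_date1 end_date1)
1 16071 20272
2 18679 19408
3 19573 20165
4 19578 20455
5 20820 20942
end
format %td start_date1 end_date1

* Measures must be in start-date order for the running maximum to make sense
sort start_date1 end_date1

* Running maximum of end dates: -replace- works through the observations in order,
* so max_end[_n-1] already holds the latest end date among all earlier measures
gen double max_end = end_date1
replace max_end = max(max_end[_n-1], end_date1) if _n > 1

* A new spell begins at the first measure, or when a measure starts
* after every earlier measure has ended
gen byte new_spell = (_n == 1) | (start_date1 > max_end[_n-1])

* The cumulative count of spell starts gives the spell identifier
gen long spell = sum(new_spell)

* Spell boundaries: earliest start and latest end within each spell
egen double spell_start = min(start_date1), by(spell)
egen double spell_end = max(end_date1), by(spell)
format %td max_end spell_start spell_end

list protection_measure start_date1 end_date1 spell spell_start spell_end, noobs sepby(spell)
```

On the example data, this should place measures #1 to #4 in spell 1 (1 January 2004 to 2 January 2016) and measure #5 in spell 2, matching the last column of the table.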
Thank you very much for your attention.