Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • calculating or identifying start/stop dates according to a rule with consecutive observations

    Hi, this is a snap shot of my data set (I have used dataex but I'm not sure if the result is correct). I have an id number, start and stop date for receipt of social benefits. If the stop date is missing it means they are still receiving the benefit. My objective is to create/calculate new runs for each individual where current runs that are 3 months or less apart are merged into one run. for ex for individual 1, 9 and 100 I would like the following result:

    Anyone who know how I can achieve that?
    ID start stop first_start last_stop
    1 2012m2 2012m2 2012m2 2014m6
    1 2012m4 2012m6 2012m2 2014m6
    1 2012m8 2013m6 2012m2 2014m6
    1 2013m9 2013m10 2012m2 2014m6
    1 2013m12 2014m6 2012m2 2014m6
    1 2018m1 . 2018m1 .
    9 2016m11 . 2016m11 .
    100 2013m3 2013m11 2013m3 2016m3
    100 2014m1 2014m8 2013m3 2016m3
    100 2014m10 2014m10 2013m3 2016m3
    100 2014m12 2015m2 2013m3 2016m3
    100 2015m4 2015m8 2013m3 2016m3
    100 2015m10 2015m10 2013m3 2016m3
    100 2015m12 2016m3 2013m3 2016m3
    100 2016m8 2016m9 2016m8 2017m1
    100 2016m12 2017m1 2016m8 2017m1
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float(id start stop)
      1 625 625
      1 627 629
      1 631 641
      1 644 645
      1 647 653
      1 696   .
      9 682   .
     13 695   .
     17 694 694
     25 662 675
     25 677 678
     25 682   .
     29 634 680
     33 643 661
     33 690   .
     37 657 704
     37 706 710
     41 679   .
     45 707   .
     49 683   .
     53 670   .
     57 696   .
     61 687 693
     61 696   .
     65 634 680
     69 636 649
     73 628 658
     73 676 690
     73 702   .
     81 637 705
     85 638 639
     85 652 703
     89 633 639
     89 641 665
     93 707   .
     97 694   .
    100 638 646
    100 648 655
    100 657 657
    100 659 661
    100 663 667
    100 669 669
    100 671 674
    100 679 680
    100 683 684
    end
    format %tm start
    format %tm stop
    Last edited by Saliha Amrani; 09 Dec 2019, 03:10.

  • #2
    The problem or question you have is not yet clear to me.

    Comment


    • #3
      I believe this does what you're looking for. Using the example data above:
      Code:
      bysort id (start): gen diff = start - stop[_n-1]
      gen first_start = start if diff > 3
      replace first_start = first_start[_n-1] if missing(first_start)
      bysort id first_start (start): gen first_stop = stop[_N]
      format first_* %tm
      list if inlist(id, 1, 9, 100), noobs sepby(id) abbr(14)
      
      
        +-----------------------------------------------------------+
        |  id     start      stop   diff   first_start   first_stop |
        |-----------------------------------------------------------|
        |   1    2012m2    2012m2      .        2012m2       2014m6 |
        |   1    2012m4    2012m6      2        2012m2       2014m6 |
        |   1    2012m8    2013m6      2        2012m2       2014m6 |
        |   1    2013m9   2013m10      3        2012m2       2014m6 |
        |   1   2013m12    2014m6      2        2012m2       2014m6 |
        |   1    2018m1         .     43        2018m1            . |
        |-----------------------------------------------------------|
        |   9   2016m11         .      .       2016m11            . |
        |-----------------------------------------------------------|
        | 100    2013m3   2013m11      .        2013m3       2016m3 |
        | 100    2014m1    2014m8      2        2013m3       2016m3 |
        | 100   2014m10   2014m10      2        2013m3       2016m3 |
        | 100   2014m12    2015m2      2        2013m3       2016m3 |
        | 100    2015m4    2015m8      2        2013m3       2016m3 |
        | 100   2015m10   2015m10      2        2013m3       2016m3 |
        | 100   2015m12    2016m3      2        2013m3       2016m3 |
        | 100    2016m8    2016m9      5        2016m8       2017m1 |
        | 100   2016m12    2017m1      3        2016m8       2017m1 |
        +-----------------------------------------------------------+
      Edit - first_stop should be named last_stop. I wasn't paying attention there.
      Last edited by Wouter Wakker; 09 Dec 2019, 04:02.

      Comment

      Working...
      X