Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Calculate time spent within a phase

    Dear Statalist,

    Code:
    clear
    input int(ID date) float phase
    1 17530 1
    1 18756 1
    1 19635 1
    1 19853 2
    1 21665 1
    2 17535 1
    2 18765 1
    2 20111 2
    2 22365 1
    2 22789 2
    3 18222 1
    3 19356 2
    3 20365 2
    3 21489 3
    3 21956 2
    3 22050 2
    4 15985 1
    4 16896 1
    4 18888 2
    4 19653 2
    4 20756 2
    4 21666 1
    4 22689 2
    end
    format %tdDD/NN/CCYY date

    If I want to calculate the duration a person (using the ID in my example) spends within a phase (calculated from the start date of the current phase to the start date of the next phase), how can I do it?

    I am very thankful for any advice on my questions

  • #2
    This may help. You may want to use more rules than are used here. I can only guess that the lengths reported as 0 and -1 need to be replaced with missing because there is no information on when the last phase ended, unless you have some further definition.

    Consider whether your definition of length needs to differ from that here. If a phase began and ended on a particular day, it would be reported as length 0 by the rules here; if it ended the day after it started, it would be reported as length 1; and so on. Many people would regard such phases as of length 1; 2; and so on.

    Code:
    clear
    input int(ID date) float phase
    1 17530 1
    1 18756 1
    1 19635 1
    1 19853 2
    1 21665 1
    2 17535 1
    2 18765 1
    2 20111 2
    2 22365 1
    2 22789 2
    3 18222 1
    3 19356 2
    3 20365 2
    3 21489 3
    3 21956 2
    3 22050 2
    4 15985 1
    4 16896 1
    4 18888 2
    4 19653 2
    4 20756 2
    4 21666 1
    4 22689 2
    end
    format %tdDD/NN/CCYY date
    
    bysort ID (date) : gen start = date if phase != phase[_n-1]
    format %tdDD/NN/CCYY start 
    by ID : gen counter = sum(phase != phase[_n-1])
    bysort ID counter: replace start = start[_n-1] if missing(start)
    bysort ID (date) : gen length = max(-1, start[_n+1] - start)  
    bysort ID counter (length) : replace length = length[_N]
    
    sort ID date
    
    list, sepby(ID counter)
    
         +---------------------------------------------------------+
         | ID         date   phase        start   counter   length |
         |---------------------------------------------------------|
      1. |  1   30/12/2007       1   30/12/2007         1     2323 |
      2. |  1   09/05/2011       1   30/12/2007         1     2323 |
      3. |  1   04/10/2013       1   30/12/2007         1     2323 |
         |---------------------------------------------------------|
      4. |  1   10/05/2014       2   10/05/2014         2     1812 |
         |---------------------------------------------------------|
      5. |  1   26/04/2019       1   26/04/2019         3       -1 |
         |---------------------------------------------------------|
      6. |  2   04/01/2008       1   04/01/2008         1     2576 |
      7. |  2   18/05/2011       1   04/01/2008         1     2576 |
         |---------------------------------------------------------|
      8. |  2   23/01/2015       2   23/01/2015         2     2254 |
         |---------------------------------------------------------|
      9. |  2   26/03/2021       1   26/03/2021         3      424 |
         |---------------------------------------------------------|
     10. |  2   24/05/2022       2   24/05/2022         4       -1 |
         |---------------------------------------------------------|
     11. |  3   21/11/2009       1   21/11/2009         1     1134 |
         |---------------------------------------------------------|
     12. |  3   29/12/2012       2   29/12/2012         2     2133 |
     13. |  3   04/10/2015       2   29/12/2012         2     2133 |
         |---------------------------------------------------------|
     14. |  3   01/11/2018       3   01/11/2018         3      467 |
         |---------------------------------------------------------|
     15. |  3   11/02/2020       2   11/02/2020         4        0 |
     16. |  3   15/05/2020       2   11/02/2020         4        0 |
         |---------------------------------------------------------|
     17. |  4   07/10/2003       1   07/10/2003         1     2903 |
     18. |  4   05/04/2006       1   07/10/2003         1     2903 |
         |---------------------------------------------------------|
     19. |  4   18/09/2011       2   18/09/2011         2     2778 |
     20. |  4   22/10/2013       2   18/09/2011         2     2778 |
     21. |  4   29/10/2016       2   18/09/2011         2     2778 |
         |---------------------------------------------------------|
     22. |  4   27/04/2019       1   27/04/2019         3     1023 |
         |---------------------------------------------------------|
     23. |  4   13/02/2022       2   13/02/2022         4       -1 |
         +---------------------------------------------------------+

    Comment


    • #3
      Hi Nick, Brilliant answer. Thanks so much for your help.

      Comment

      Working...
      X