Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • RE: How to generate length of service engagment in days between two dates, different variables, conditional on a 3rd variable value

    Hi STATA users, I have a longitudinal dataset with which I am trying to generate a 'length of service' variable that represents the time period of engagement of an individual with a community health service, after discharge from hospital. I have the following variables:

    - 'ppn' = identifier
    - presentMH_nonfmt = date seen by service
    - start_nonfmtd = date presenting to hospital
    - sepdate_non_fmtd = date separated from hospital
    - I have generated 'AMBMH_7days_after = 1 if presentMH_nonfmt > sepdate_non_fmt & presentMH_nonfmt < sepdate_nn_fmtd +8 (** This is a service indicator stating whether the person attending the community service within the week after d/c from hospital)
    - activity_code1 = string categorical vble which describes service activity - may or may not start with 'open request' but I want to calculate LENGTH OF SERVICE for first day of engagement after hospital (if <8 days) until activity_code1 variable states 'Close Request'.

    NB: The 'start' and 'sepdate' values were merged in from an incident hospital admission dataset - by ppn - so the date of that incident start and sepdate seen is therefore the same for all.

    *** Example data generated by dataex: ssc install dataex

    clear
    Code:
    input long ppn float presentMH_nonfmt float start_nonfmtd float sepdate_nonfmtd float AMBMH_7days_after str11 activity_code1
    
    0071 21808 21807 21808 . "triage"
    0071 21809 21807 21808 1 "counselling"
    0071 21812 21807 21808 1 "Assess"
    0071 21817 21807 21808 . "CareMx"
    0071 21832 21807 21808 . "CareConf"
    0071 21836 21807 21808 . "CareMx"
    0071 21840 21807 21808 . "Counselling"
    0071 21843 21807 21808 . "CareMx"
    0071 21843 21807 21808 . "CloseReq"
    00132 21182 21180 21181 1 "CarePlan"
    00132 21182 21180 21181 1 "CarePlan"
    00132 21183 21180 21181 1 "CarePlan"
    00132 21184 21180 21181 1 "Assess"
    00132 21186 21180 21181 1 "CareMx"
    00132 21186 21180 21181 1 "CloseReq"
    end
    
    format presentMH_nonfmt start_nonfmtd sepdate_nonfmtd %td
    *** I hope this makes sense?

    Thank you in advance from any advice.

    Lisa
    Last edited by Lisa Nicole; 19 Aug 2022, 22:38.

  • #2
    The example data you show is not generated by -dataex-. It doesn't even run when I tried to use it to import your data:
    Code:
    . clear
    
    .
    . input long ppn float presentMH_nonfmt float start_nonfmtd float sepdate_nonfmtd float AMBMH_7days_after int activity_code1
    
                  ppn  present~t  start_n~d  sepdate~d  AMBMH_7~r  activi~1
      1.
    . 0071 triage 21808 21807 21808 .
    'triage' cannot be read as a number
      1. 0071 counselling 21809 21807 21808 1
    'counselling' cannot be read as a number
      1. 0071 Assess 21812 21807 21808 1
    'Assess' cannot be read as a number
      1. 0071 CareMx 21817 21807 21808 .
    'CareMx' cannot be read as a number
      1. 0071 CareConf 21832 21807 21808 .
    
    [output truncated, but continues on in the same way]
    Please post back with actual -dataex- results so that a workable example of your data is available to those who want to help you.

    In addition, I do not understand what you mean by "first day of engagement after hospital". How is that defined in terms of your variables? It isn't clear to me from your description.

    Finally, is it guaranteed that every ppn's data ultimately includes an activity_code_1 of "Close Request"? If not, what is the endpoint for calculating length of service?

    Comment


    • #3
      Hi Clyde Schechter - SO sorry (2nd ever post ) - I am working on a secure remote server that I cannot copy data from - so have typed this out and clearly didn't test it first on my local STATA... I did have the variable names in the wrong order - and didn't specify that the vble 'activity_code1' is string... I have tried to amend by editing the post but on testing it locally it doesn't work either... Will try and fix... in the interim and to answer your questions - the 'first day of engagement after hospital' - is defined by using the variable presentMH_nonfmt (a date) and determining if it is within 7 days of hospital discharge (i.e. variable 'sepdate_nonfmtd' ...

      RE: 'guaranteed' - in an ideal world yes - but these data are not fabulous (compiled from various community mental health services) - so they 'should' have a Close Request' per 'contiguous period of care' - but am not totally sure if all do.

      The dataset has 3,645,033 separations for 56,676 ppns.

      Thank you for looking. I will try and solve my dataex issue.

      Comment


      • #4
        - OK apologies again Clyde Schechter - I do think I have fixed it though... please try the above code block. I had the variable names out of order, didn't have the format correct for the str vble, nor the "". Learnt sthg new. Thanks in advance for any coding advice and my apologies if it is still not clear what I am trying to achieve?
        ​​​​​​​

        Comment


        • #5
          Lisa, it would be helpful if you could tell us in words how you would calculate the variable you need from this data, and what values it would take for the two ppn's in your example data.

          Comment


          • #6
            Maybe?

            Code:
            * Example generated by -dataex-. For more info, type help dataex
            clear
            input long ppn float(presentMH_nonfmt start_nonfmtd sepdate_nonfmtd AMBMH_7days_after) str11 activity_code1
             71 21808 21807 21808 . "triage"     
             71 21809 21807 21808 1 "counselling"
             71 21812 21807 21808 1 "Assess"     
             71 21817 21807 21808 . "CareMx"     
             71 21832 21807 21808 . "CareConf"   
             71 21836 21807 21808 . "CareMx"     
             71 21840 21807 21808 . "Counselling"
             71 21843 21807 21808 . "CareMx"     
             71 21843 21807 21808 . "CloseReq"   
            132 21182 21180 21181 1 "CarePlan"   
            132 21182 21180 21181 1 "CarePlan"   
            132 21183 21180 21181 1 "CarePlan"   
            132 21184 21180 21181 1 "Assess"     
            132 21186 21180 21181 1 "CareMx"     
            132 21186 21180 21181 1 "CloseReq"   
            end
            
            bys ppn (presentMH_nonfmt): g first= sum(AMBMH_7days_after)==1
            bys ppn: egen start= max(cond(first, presentMH_nonfmt,.))
            bys ppn: egen end= max(cond(activity_code1=="CloseReq" & !missing(start), presentMH_nonfmt,.))
            g wanted= end-start+1
            Res.:

            Code:
            . l, sepby(ppn)
            
                 +------------------------------------------------------------------------------------------------+
                 | ppn   presen~t   start_~d   sepdat~d   AMBMH_~r   activity_~1   first   start     end   wanted |
                 |------------------------------------------------------------------------------------------------|
              1. |  71      21808      21807      21808          .        triage       0   21809   21843       35 |
              2. |  71      21809      21807      21808          1   counselling       1   21809   21843       35 |
              3. |  71      21812      21807      21808          1        Assess       0   21809   21843       35 |
              4. |  71      21817      21807      21808          .        CareMx       0   21809   21843       35 |
              5. |  71      21832      21807      21808          .      CareConf       0   21809   21843       35 |
              6. |  71      21836      21807      21808          .        CareMx       0   21809   21843       35 |
              7. |  71      21840      21807      21808          .   Counselling       0   21809   21843       35 |
              8. |  71      21843      21807      21808          .        CareMx       0   21809   21843       35 |
              9. |  71      21843      21807      21808          .      CloseReq       0   21809   21843       35 |
                 |------------------------------------------------------------------------------------------------|
             10. | 132      21182      21180      21181          1      CarePlan       1   21182   21186        5 |
             11. | 132      21182      21180      21181          1      CarePlan       0   21182   21186        5 |
             12. | 132      21183      21180      21181          1      CarePlan       0   21182   21186        5 |
             13. | 132      21184      21180      21181          1        Assess       0   21182   21186        5 |
             14. | 132      21186      21180      21181          1        CareMx       0   21182   21186        5 |
             15. | 132      21186      21180      21181          1      CloseReq       0   21182   21186        5 |
                 +------------------------------------------------------------------------------------------------+

            Comment


            • #7
              See also Section 9 of https://www.stata-journal.com/articl...article=dm0055 for Andrew's egen technique.

              Comment


              • #8
                Thank you Andrew Musau ! - awesome work as I have previously seen from you (and thanks for your additional pointer Nick Cox ) - I have worked this through and I have one question if possible?
                1. For the code:
                Code:
                 
                 bys ppn (presentMH_nonfmt): g first= sum(AMBMH_7days_after)==1
                I can't quite work out how the 'sum' command is selecting only the first (longitudinally) - as opposed to adding (what I usually use 'sum' for) - all counts of AMBMH_7days_after==1 by ppn? It has done what I was trying to achieve but not sure how? Thank you in advance for further clarification.


                THANK YOU!!

                Comment


                • #9
                  the sum function by itself is doing what you expect. The trick is in the boolean evaluation sum() == 1. You can see this more clearly if you break this line up:

                  Code:
                  bys ppn (presentMH_nonfmt): g summed = sum(AMBMH_7days_after)
                  gen first = (summed == 1)
                  
                  bys ppn (presentMH_nonfmt): g first2 = sum(AMBMH_7days_after) == 1
                  
                  assert first == first2
                  Last edited by Hemanshu Kumar; 21 Aug 2022, 00:05.

                  Comment


                  • #10
                    See also https://www.stata.com/support/faqs/d...t-occurrences/ for longer discussion if needed.

                    Comment


                    • #11
                      Thank you so much all! I love STATA !

                      Comment


                      • #12
                        Actually, you love Stata

                        Comment

                        Working...
                        X