Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Deriving new variable by summing a variable in a rolling time window

    Hi Everyone,

    I have a dataset that looks roughly like this:


    input str8 month_year studyid edvisits

    "AUG_08" 1 0
    "SEP_08" 1 2
    "OCT_08" 1 1
    "NOV_08" 1 0
    "DEC_08" 1 0
    "AUG_08" 2 0
    "SEP_08" 2 0
    "OCT_08" 2 1
    "NOV_08" 2 0
    "DEC_08" 2 0
    "AUG_08" 3 1
    "SEP_08" 3 0
    "OCT_08" 3 2
    "NOV_08" 3 0
    "DEC_08" 3 0
    "AUG_08" 4 1
    "SEP_08" 4 0
    "OCT_08" 4 0
    "NOV_08" 4 1
    "DEC_08" 4 1

    What I need to do is generate a new variable called "repeat_user" that =1 if a person has 3 or more edvisits in a 3 month period and =0 if they don't.

    so the "want" dataset would look like this:

    input str8 month_year studyid edvisits repeat_user

    'AUG_08' 1 0 1
    'SEP_08' 1 2 1
    'OCT_08' 1 1 1
    'NOV_08' 1 0 1
    'DEC_08' 1 0 1
    'AUG_08' 2 0 0
    'SEP_08' 2 0 0
    'OCT_08' 2 1 0
    'NOV_08' 2 0 0
    'DEC_08' 2 0 0
    'AUG_08' 3 1 1
    'SEP_08' 3 0 1
    'OCT_08' 3 2 1
    'NOV_08' 3 0 1
    'DEC_08' 3 0 1
    'AUG_08' 4 1 0
    'SEP_08' 4 0 0
    'OCT_08' 4 0 0
    'NOV_08' 4 1 0
    'DEC_08' 4 1 0

    For patient 1 repeat_user = 1 because they has 3 visits in 2 months
    For patient 2 repeat_user = 0 because they only had 1 visit
    For patient 3 repeat_user = 1 because they had 3 visits in 3 months
    For patient 4 repeat_user = 0 because even though they had 3 visits, they were not within a three month period

    Month_year is currently formatted as a proper date variable.

    Any help would be much appreciated!

    Thanks so much.

    Mike







    Last edited by Mike Reid; 24 Oct 2019, 13:29.

  • #2
    This is wrong: there is no sense in which month_year is a proper Stata date variable, if only because all such variables are numeric.

    But you can make progress. This approach uses rangestat from SSC. The assumption is that all dates are this century. If that's wrong it can be fixed.


    Code:
    clear
    input str8 month_year studyid edvisits
    "AUG_08" 1 0
    "SEP_08" 1 2
    "OCT_08" 1 1
    "NOV_08" 1 0
    "DEC_08" 1 0
    "AUG_08" 2 0
    "SEP_08" 2 0
    "OCT_08" 2 1
    "NOV_08" 2 0
    "DEC_08" 2 0
    "AUG_08" 3 1
    "SEP_08" 3 0
    "OCT_08" 3 2
    "NOV_08" 3 0
    "DEC_08" 3 0
    "AUG_08" 4 1
    "SEP_08" 4 0
    "OCT_08" 4 0
    "NOV_08" 4 1
    "DEC_08" 4 1
    end
    
    gen mdate = monthly(substr(month_year, 1, 3) + " 20" + substr(month_year, -2, 2), "MY")
    format mdate %tm
    
    rangestat (sum) total=edvisits, interval(mdate -2 0) by(studyid)
    rangestat (max) max=total, interval(studyid 0 0)
    gen wanted = max >= 3
    
    list, sepby(studyid)
    
         +----------------------------------------------------------------+
         | month_~r   studyid   edvisits     mdate   total   max   wanted |
         |----------------------------------------------------------------|
      1. |   AUG_08         1          0    2008m8       0     3        1 |
      2. |   SEP_08         1          2    2008m9       2     3        1 |
      3. |   OCT_08         1          1   2008m10       3     3        1 |
      4. |   NOV_08         1          0   2008m11       3     3        1 |
      5. |   DEC_08         1          0   2008m12       1     3        1 |
         |----------------------------------------------------------------|
      6. |   AUG_08         2          0    2008m8       0     1        0 |
      7. |   SEP_08         2          0    2008m9       0     1        0 |
      8. |   OCT_08         2          1   2008m10       1     1        0 |
      9. |   NOV_08         2          0   2008m11       1     1        0 |
     10. |   DEC_08         2          0   2008m12       1     1        0 |
         |----------------------------------------------------------------|
     11. |   AUG_08         3          1    2008m8       1     3        1 |
     12. |   SEP_08         3          0    2008m9       1     3        1 |
     13. |   OCT_08         3          2   2008m10       3     3        1 |
     14. |   NOV_08         3          0   2008m11       2     3        1 |
     15. |   DEC_08         3          0   2008m12       2     3        1 |
         |----------------------------------------------------------------|
     16. |   AUG_08         4          1    2008m8       1     2        0 |
     17. |   SEP_08         4          0    2008m9       1     2        0 |
     18. |   OCT_08         4          0   2008m10       1     2        0 |
     19. |   NOV_08         4          1   2008m11       1     2        0 |
     20. |   DEC_08         4          1   2008m12       2     2        0 |
         +----------------------------------------------------------------+

    Comment


    • #3
      Hi Nick,
      Thanks so much for this, it worked perfectly once I got rangestat installed. Apologies for the date confusion, I've clearly still got lots to learn. Your help is much appreciated.
      Mike

      Comment

      Working...
      X