Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Adjusting times across survey dates

    Hello everyone,

    I have a question building on a previous post of mine (here: https://www.statalist.org/forums/for...ly-survey-data) where I have survey data of job shift start-stop times, measured daily for individual respondents.

    One nuanced complication is that (occasionally) respondents taking the survey would indicate their shift for "that day" was really an overnight shift that ended earlier that morning but technically began the day prior. For example, Respondent A would indicate on October 10 that they had a shift that began at 9pm and ended at 2am. (Note that these instances are easily detected by subtracting stop - start, which should have a positive value unless the shift crosses over midnight).

    In these cases, we'd ideally want to split out these hours worked by the actual day the work took place and create rows according to the actual date worked. In my hypothetical example, Person A's row for October 10 should be start = 0 and stop = 2 (start and stop times are numeric in 24-hour time format), and a new row would be added for for Person A on October 9 where the start = 21 and stop = 23.99.


    Some simplified example data:

    Code:
    clear
    input float(date start stop) long person_id
    22200   22  3  1
    21866   8  12  1
    22189   20  1  1
    22208   20.5  3  1
    22213   .  .  2
    22219   14 0  2
    22195   22  4  2
    21892 7.5 16  2
    22232   8  12  3
    21862   8  14  3
    21869   23.5  1.25  3
    21836   7  12  3
    end
    format %td date
    sort person_id date


    And here's roughly how I'd hope the resulting dataset looked:
    date start stop person_id
    11/13/2019 8 12 1
    9/30/2020 20 23.99 1
    10/1/2020 0 1 1
    10/11/2020 22 22.99 1
    10/12/2020 0 3 1
    10/19/2020 21 23.99 1
    10/20/2020 0 3 1
    12/9/2019 7.5 16 2
    10/6/2020 22 23.99 2
    10/7/2020 0 4 2
    10/25/2020 2
    10/31/2020 14 0 2
    10/14/2019 7 12 3
    11/9/2019 8 14 3
    11/15/2019 23.5 23.99 3
    11/16/2019 0 1.25 3
    11/13/2020 8 12 3

  • #2
    Code:
    isid person_id date, sort
    
    frame put _all if stop < start, into(double_days)
    frame double_days {
        expand 2
        by person_id date, sort: replace stop = 23.99 if _n == 1
        by person_id date: replace start = 0 if _n == 2
        by person_id date: replace date = date -1 if _n == 1
    }
    
    drop if stop < start
    frameappend double_days, drop
    sort person_id date start
    Note: -frameappend- is by Jeremy Freese, and is available from SSC.

    One little wrinkle here. Suppose I work from 22:00 on July 1 through 6:00 on July 2 and also work from 20:00 to 23:30 on July 2. After applying this method, it will no longer be the case that id and date uniquely identify observations in the data: there will now be two observations for me on July 2, corresponding to the two, separated periods I worked. Thank you for posting the link to the earlier thread involving this (or closely related) data. I scanned the code there and I notice that it begins with -isid person_id survey_date-. That line will now break with an error message if you apply that code to the results from the current code. Further examining the rest of the code, I don't think that that verification of unique identification was actually necessary for the old code to work. So I think if you remove that -isid person_id survey_date- from that code, it will run correctly. But I have not attempted to test that out with data. If you run into a problem, post back with an example that reproduces it and I'll try to troubleshoot.

    Comment


    • #3
      Thanks very much Clyde Schechter , this is exactly what I was looking for. You're correct about that unique observation wrinkle, I'd considered that before but will have to do some further testing with the real data now to see if there are any implications/problems that I haven't anticipated. I'll post back if so. Thanks again!

      Comment

      Working...
      X