  • Variable Construction based on Time Stamps

    Dear Statalist,

    I have a dataset that records the time at which staff successfully pick products according to the bar code. For instance, the first row means that the first picker (pickid=1) picked a certain item at 01oct2019 12:41:37.
    I observe that a picker may work several shifts during a given day. For instance, in the 8th observation, pickid 1 picks an item at 01oct2019 15:00:35, then rests for about two hours and starts his second shift that day at around 17:00.

    [CODE]
    clear
    input float pickid double pick float(date shift)
    1 1885552897000 21823 1
    1 1885553368000 21823 1
    1 1885553548000 21823 1
    1 1885554267000 21823 1
    1 1885555750000 21823 1
    1 1885559182000 21823 1
    1 1885561029000 21823 1
    1 1885561235000 21823 1
    1 1885568751000 21823 2
    1 1885568794000 21823 2
    1 1889124609000 21864 1
    1 1889125397000 21864 1
    1 1889125511000 21864 1
    1 1889126216000 21864 1
    1 1889126746000 21864 1
    1 1889127005000 21864 1
    1 1889127208000 21864 1
    1 1889128021000 21864 1
    1 1889128155000 21864 1
    1 1889128219000 21864 1
    1 1891763437000 21895 1
    1 1891763843000 21895 1
    1 1891765585000 21895 1
    1 1891765845000 21895 1
    1 1891766075000 21895 1
    1 1891766170000 21895 1
    1 1891767753000 21895 1
    1 1891768296000 21895 1
    1 1891790940000 21895 2
    1 1891791650000 21895 2
    2 1889050832000 21864 1
    2 1889051543000 21864 1
    2 1889055404000 21864 1
    2 1889058548000 21864 1
    2 1889061753000 21864 1
    2 1889062340000 21864 1
    2 1889063595000 21864 1
    2 1889065232000 21864 1
    2 1889116535000 21864 2
    2 1889116760000 21864 2
    end
    format %tc pick
    format %td date
    [/CODE]



    I want to generate a variable SHIFT that numbers the shifts a picker works within a day. For instance, SHIFT=1 for the picker's first shift of the day. If the gap between the current pick and the previous pick is more than 2 hours, then the current pick (and later picks) belong to the second shift, and so on. A picker can have several shifts a day. Can someone help me generate a SHIFT variable?
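
    The rule above can be sketched with a by-group running sum instead of an observation loop. This is only a minimal sketch: gap_sec and shift_new are hypothetical names (chosen so they do not clash with the shift variable already in the example data), and it assumes pick is a %tc value in milliseconds and that "more than 2 hours" means a gap strictly greater than 7200 seconds.

    ```stata
    * Sketch only: gap_sec and shift_new are hypothetical names.
    * Gap in seconds to the previous pick of the same picker on the same day
    * (missing for the first pick of each picker-day).
    bysort pickid date (pick): gen double gap_sec = (pick - pick[_n-1]) / 1000

    * A new shift starts whenever the gap exceeds 7200 s. Missing values compare
    * as larger than any number in Stata, so they are excluded explicitly.
    * sum() is a running sum and restarts within each by-group.
    by pickid date: gen int shift_new = 1 + sum(gap_sec > 7200 & !missing(gap_sec))
    ```

    On the example data, assert shift_new == shift should pass if the sketch matches the intended definition.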

    In addition, I want to generate two experience variables. The first, CURRENTEXP, is the picker’s cumulative working hours within the day up to the current pick. I plan to construct a proxy for it as follows; please let me know if this is not appropriate. For picker 1 in the first shift of the first day, CURRENTEXP at the current pick is the previous pick time minus the first pick time in shift 1 (01oct2019 12:41:37). For picker 1 in the second shift, CURRENTEXP is [(last pick time in shift 1 - first pick time in shift 1) + (previous pick time - first pick time in shift 2)], and so on.
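
    One way to express this logic, as a hedged sketch: count only the time between consecutive picks of the same shift as working time, then take a running sum within the picker-day that stops at the previous pick. The names work_sec and currentexp_hr are hypothetical, the shift variable is taken from the example data, and the first pick of each shift is assumed to contribute zero within-shift time.

    ```stata
    * Sketch only: work_sec and currentexp_hr are hypothetical names.
    * Working time (seconds) between this pick and the previous pick of the same
    * shift; the break between two shifts is not counted.
    bysort pickid date shift (pick): gen double work_sec = ///
        cond(_n == 1, 0, (pick - pick[_n-1]) / 1000)

    * Running sum within the picker-day, minus the current increment, gives the
    * time worked up to the previous pick, converted to hours.
    bysort pickid date (pick): gen double currentexp_hr = ///
        (sum(work_sec) - work_sec) / 3600
    ```

    For the first pick of shift 2 on 01oct2019 this gives 8338/3600 hours, which is the full length of shift 1 (12:41:37 to 15:00:35).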

    The second variable, TOTEXP, is the picker’s cumulative working hours over all days up to the current pick. For instance, on the nth day, for picker 1 in the first shift, TOTEXP should be the cumulative working hours over the previous n-1 days plus the cumulative working hours so far on that day. The logic I have in mind is as follows; please let me know if I am wrong. Can someone help me generate these two variables?
    (Day 1 shift 1 last pick time - Day 1 shift 1 first pick time) + (Day 1 shift 2 last pick time - Day 1 shift 2 first pick time) + (Day 2 shift 1 last pick time - Day 2 shift 1 first pick time) + …
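
    Because each (last pick - first pick) term is just the sum of the gaps between consecutive picks within that shift, the expression above can be sketched as a running sum over all of a picker's picks. The names inc_sec and totexp_hr are hypothetical, the shift variable is taken from the example data, and, as an assumption, the running total stops at the previous pick.

    ```stata
    * Sketch only: inc_sec and totexp_hr are hypothetical names.
    * Within-shift increment in seconds; zero at the first pick of every shift,
    * so breaks between shifts and between days are not counted.
    bysort pickid date shift (pick): gen double inc_sec = ///
        cond(_n == 1, 0, (pick - pick[_n-1]) / 1000)

    * Running sum over all picks of a picker, ordered by time across days,
    * excluding the current increment, converted to hours.
    bysort pickid (pick): gen double totexp_hr = (sum(inc_sec) - inc_sec) / 3600
    ```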

    The last question is about estimation techniques. For each pick, I can calculate how long the picker takes to find the product. For instance, picker 1 takes 10s to find product 1 and 20s to find product 2. I have multiple products, multiple pickers, and multiple days. I see the literature using the AFT model to model the factors affecting pick time, such as picker experience. Is it appropriate to use the AFT model here? I would use the following code:
    stset pick_time // should I take the log here?
    streg totexp currentexp, distribution(lognormal) time nolog

    I am sorry for asking so many questions; I do not have much experience with this kind of dataset. I hope I have made myself clear. Any help will be highly appreciated. Thank you very much!
    Best,
    Changjun


  • #2
    Dear Statalist,

    I have been working on these problems on my own for a while. For the first question, I tried the following code. Assuming there is only one day and one picker, it works.

    bys pickid date (pick): gen double break=(pick-pick[_n-1])/1000

    gen shift=1
    local N = _N
    local i=1
    forvalues i = 1/`N' {
    replace shift=shift[_n-1] in `i' if _n>=2
    replace shift=shift[_n-1]+1 in `i' if break[`i']>=7200 & break[`i']!=.
    }

    Then I tried to generalize it to multiple pickers and multiple days using the by prefix, but that code does not work:
    gen shift=1
    local N = _N
    local i=1
    forvalues i = 1/`N' {
    by pickid date: replace shift=shift[_n-1] in `i' if _n>=2
    by pickid date: replace shift=shift[_n-1]+1 in `i' if break[`i']>=7200 & break[`i']!=.
    }

    Can someone give some suggestions or directions?
    Thank you very much.

    Best,
    Changjun


