  • Create time sequence

    Hi Stata users,

    I have the data in the structure below

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float(id time)
    21 5
    22 3
    23 2
    end

    and would want it to be as shown below

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float(id time event)
    21 1 0
    21 2 0
    21 3 0
    21 4 0
    21 5 1
    22 1 0
    22 2 0
    22 3 1
    22 4 0
    22 5 0
    23 1 0
    23 2 1
    23 3 0
    23 4 0
    23 5 0
    end

    Thanks in advance

  • #2
    Based on your minimal data example, the logic that defines the maximum time is not clear. I will assume that it is the largest observed time in your dataset (but you can change this). From there, it's relatively straightforward if you understand how to count within groups.

    Code:
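    * the largest observed time sets the length of every id's sequence
    summ time, meanonly
    expand `r(max)'
    rename time want
    bys id: gen time = _n
    * event is 1 only in the period equal to the original time
    bys id: gen byte event = _n==want
    drop want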
    As another approach, a reshape "trick" could work, but it has its own pitfalls.
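The reshape "trick" is not spelled out above, so the following is only one possible reading of it, sketched under the assumptions that each id appears once and that the sequence should run from 1 to the largest observed time. The local `tmax` and the loop that fills in absent columns are illustrative additions. The loop also shows one likely pitfall: times that never occur in the data (here 1 and 4) get no column from -reshape wide- and have to be created by hand.

```stata
clear
input float(id time)
21 5
22 3
23 2
end

* this sketch assumes one row per id
isid id
gen byte event = 1
summarize time, meanonly
local tmax = r(max)

* one column per observed time: event2 event3 event5
reshape wide event, i(id) j(time)

* add the columns for times that were never observed
forvalues t = 1/`tmax' {
    capture confirm variable event`t', exact
    if _rc gen byte event`t' = .
}

* back to long form; periods without an event become 0
reshape long event, i(id) j(time)
replace event = 0 if missing(event)

list, noobs clean
```

For data of this shape, the -expand- code above is shorter and needs no such patching.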


    • #3
      In your example data, each id appears only once. If this is also the case in your real data, then a simpler solution is possible. But the following, somewhat more complicated, code will work whether or not an id can appear more than once. It does, however, require that if a given id appears more than once, its values of the time variable differ: the same combination of id and time cannot appear more than once. The code verifies and enforces this restriction, and will break at the -isid- command if it is violated.

      Code:
      clear*
      
      * Example generated by -dataex-. For more info, type help dataex
      clear
      input float(id time)
      21 5
      22 3
      23 2
      end
      
      isid id time, sort
      
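      * build the full id-by-time grid in a separate frame
      frame put id, into(expanded)
      frame change expanded
      duplicates drop
      * 5 is the largest time in the example data
      expand 5
      by id, sort: gen time = _n
      * an id-time pair that matches the original data marks an event
      frlink 1:1 id time, frame(default)
      gen byte event = !missing(default)
      
      drop default
      frame drop default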
      
      list, noobs clean
      Added: Crossed with #2, which uses a different approach, and also raises further questions about the underlying data.
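As a hedged variation on the code above, the hard-coded `expand 5` can be replaced by the largest observed time, which #4 later confirms is the intended maximum. The example data here is modified (an assumption for illustration) so that id 21 appears twice, to show the case the frames approach is meant to handle. The local `tmax` is an illustrative name.

```stata
clear all

input float(id time)
21 5
21 2
22 3
23 2
end

* each id-time pair must be unique
isid id time, sort

* take the sequence length from the data instead of hard-coding it
summarize time, meanonly
local tmax = r(max)

frame put id, into(expanded)
frame change expanded
duplicates drop
expand `tmax'
by id, sort: gen time = _n

* matched id-time pairs are events; unmatched pairs get a missing link
frlink 1:1 id time, frame(default)
gen byte event = !missing(default)
drop default

list, noobs clean
```

With this input, id 21 should carry two events (at times 2 and 5) within its five rows.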
      Last edited by Clyde Schechter; 20 Oct 2021, 12:43.


      • #4
        Thanks so much Leonardo Guizzetti for your response. You are right - the maximum time is the largest observed value.


        • #5
          Clyde Schechter Thanks a ton for providing a solution that addresses a complex problem. This is much appreciated.
