Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Expand data with pairwise combinations

    Hello,

    I have the following trip data:


    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str8 date byte trip_id int vehicle str1(stop_number origin) str13 origin_time str1 destination str13 destination_time
    "1-Jan-24" 1 1000 "1" "A" "1/1/2024 6:00" "B" "1/1/2024 6:03"
    "1-Jan-24" 1 1000 "2" "B" "1/1/2024 6:05" "C" "1/1/2024 6:09"
    "1-Jan-24" 1 1000 "3" "C" "1/1/2024 6:12" "D" "1/1/2024 6:15"
    "1-Jan-24" 1 1000 "4" "D" "1/1/2024 6:17" "E" "1/1/2024 6:26"
    "1-Jan-24" 1 1000 "5" "E" "1/1/2024 6:28" "F" "1/1/2024 6:32"
    "1-Jan-24" 2 1001 "1" "A" "1/1/2024 7:00" "B" "1/1/2024 7:03"
    "1-Jan-24" 2 1001 "2" "B" "1/1/2024 7:05" "C" "1/1/2024 7:09"
    "1-Jan-24" 2 1001 "3" "C" "1/1/2024 7:12" "D" "1/1/2024 7:15"
    "1-Jan-24" 2 1001 "4" "D" "1/1/2024 7:17" "E" "1/1/2024 7:26"
    "1-Jan-24" 2 1001 "5" "E" "1/1/2024 7:28" "F" "1/1/2024 7:32"
    end
    I would like to create pairwise combinations in a way that each origin location is placed before a unique destination in order. The data is unique on date, trip_id, vehicle, and stop_number.

    Essentially this is how I would like the data to end up looking like:

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str8 date byte trip_id str1 origin str13 origin_time str1 destination str13 destination_time
    "1-Jan-24" 1 "A" "1/1/2024 6:00" "B" "1/1/2024 6:03"
    "1-Jan-24" 1 "A" "1/1/2024 6:00" "C" "1/1/2024 6:09"
    "1-Jan-24" 1 "A" "1/1/2024 6:00" "D" "1/1/2024 6:15"
    "1-Jan-24" 1 "A" "1/1/2024 6:00" "E" "1/1/2024 6:26"
    "1-Jan-24" 1 "A" "1/1/2024 6:00" "F" "1/1/2024 6:32"
    "1-Jan-24" 1 "B" "1/1/2024 6:05" "C" "1/1/2024 6:09"
    "1-Jan-24" 1 "B" "1/1/2024 6:05" "D" "1/1/2024 6:15"
    "1-Jan-24" 1 "B" "1/1/2024 6:05" "E" "1/1/2024 6:26"
    "1-Jan-24" 1 "B" "1/1/2024 6:05" "F" "1/1/2024 6:32"
    "1-Jan-24" 1 "C" "1/1/2024 6:12" "D" "1/1/2024 6:15"
    "1-Jan-24" 1 "C" "1/1/2024 6:12" "E" "1/1/2024 6:26"
    "1-Jan-24" 1 "C" "1/1/2024 6:12" "F" "1/1/2024 6:32"
    "1-Jan-24" 1 "D" "1/1/2024 6:17" "E" "1/1/2024 6:26"
    "1-Jan-24" 1 "D" "1/1/2024 6:17" "F" "1/1/2024 6:32"
    "1-Jan-24" 1 "E" "1/1/2024 6:28" "F" "1/1/2024 6:32"
    "1-Jan-24" 2 "A" "1/1/2024 7:00" "B" "1/1/2024 7:03"
    "1-Jan-24" 2 "A" "1/1/2024 7:00" "C" "1/1/2024 7:09"
    "1-Jan-24" 2 "A" "1/1/2024 7:00" "D" "1/1/2024 7:15"
    "1-Jan-24" 2 "A" "1/1/2024 7:00" "E" "1/1/2024 7:26"
    "1-Jan-24" 2 "A" "1/1/2024 7:00" "F" "1/1/2024 7:32"
    "1-Jan-24" 2 "B" "1/1/2024 7:05" "C" "1/1/2024 7:09"
    "1-Jan-24" 2 "B" "1/1/2024 7:05" "D" "1/1/2024 7:15"
    "1-Jan-24" 2 "B" "1/1/2024 7:05" "E" "1/1/2024 7:26"
    "1-Jan-24" 2 "B" "1/1/2024 7:05" "F" "1/1/2024 7:32"
    "1-Jan-24" 2 "C" "1/1/2024 7:12" "D" "1/1/2024 7:15"
    "1-Jan-24" 2 "C" "1/1/2024 7:12" "E" "1/1/2024 7:26"
    "1-Jan-24" 2 "C" "1/1/2024 7:12" "F" "1/1/2024 7:32"
    "1-Jan-24" 2 "D" "1/1/2024 7:17" "E" "1/1/2024 7:26"
    "1-Jan-24" 2 "D" "1/1/2024 7:17" "F" "1/1/2024 7:32"
    "1-Jan-24" 2 "E" "1/1/2024 7:28" "F" "1/1/2024 7:32"
    end
    I tried using the "cross" from Stata but that did not seem like an efficient strategy. Hoping to hear from the pros here.

  • #2
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str8 date byte trip_id int vehicle str1(stop_number origin) str13 origin_time str1 destination str13 destination_time
    "1-Jan-24" 1 1000 "1" "A" "1/1/2024 6:00" "B" "1/1/2024 6:03"
    "1-Jan-24" 1 1000 "2" "B" "1/1/2024 6:05" "C" "1/1/2024 6:09"
    "1-Jan-24" 1 1000 "3" "C" "1/1/2024 6:12" "D" "1/1/2024 6:15"
    "1-Jan-24" 1 1000 "4" "D" "1/1/2024 6:17" "E" "1/1/2024 6:26"
    "1-Jan-24" 1 1000 "5" "E" "1/1/2024 6:28" "F" "1/1/2024 6:32"
    "1-Jan-24" 2 1001 "1" "A" "1/1/2024 7:00" "B" "1/1/2024 7:03"
    "1-Jan-24" 2 1001 "2" "B" "1/1/2024 7:05" "C" "1/1/2024 7:09"
    "1-Jan-24" 2 1001 "3" "C" "1/1/2024 7:12" "D" "1/1/2024 7:15"
    "1-Jan-24" 2 1001 "4" "D" "1/1/2024 7:17" "E" "1/1/2024 7:26"
    "1-Jan-24" 2 1001 "5" "E" "1/1/2024 7:28" "F" "1/1/2024 7:32"
    end
    
    preserve
    keep date trip_id vehicle destination destination_time
    tempfile dest
    save `dest'
    restore
    keep date trip_id vehicle origin origin_time
    joinby date trip_id vehicle using `dest' 
    bys date trip_id vehicle (origin destination): keep if origin< destination
    Res.:

    Code:
    . l, sepby(date trip_id)
    
         +----------------------------------------------------------------------------------+
         |     date   trip_id   vehicle   origin     origin_time   destin~n   destination~e |
         |----------------------------------------------------------------------------------|
      1. | 1-Jan-24         1      1000        A   1/1/2024 6:00          B   1/1/2024 6:03 |
      2. | 1-Jan-24         1      1000        A   1/1/2024 6:00          C   1/1/2024 6:09 |
      3. | 1-Jan-24         1      1000        A   1/1/2024 6:00          D   1/1/2024 6:15 |
      4. | 1-Jan-24         1      1000        A   1/1/2024 6:00          E   1/1/2024 6:26 |
      5. | 1-Jan-24         1      1000        A   1/1/2024 6:00          F   1/1/2024 6:32 |
      6. | 1-Jan-24         1      1000        B   1/1/2024 6:05          C   1/1/2024 6:09 |
      7. | 1-Jan-24         1      1000        B   1/1/2024 6:05          D   1/1/2024 6:15 |
      8. | 1-Jan-24         1      1000        B   1/1/2024 6:05          E   1/1/2024 6:26 |
      9. | 1-Jan-24         1      1000        B   1/1/2024 6:05          F   1/1/2024 6:32 |
     10. | 1-Jan-24         1      1000        C   1/1/2024 6:12          D   1/1/2024 6:15 |
     11. | 1-Jan-24         1      1000        C   1/1/2024 6:12          E   1/1/2024 6:26 |
     12. | 1-Jan-24         1      1000        C   1/1/2024 6:12          F   1/1/2024 6:32 |
     13. | 1-Jan-24         1      1000        D   1/1/2024 6:17          E   1/1/2024 6:26 |
     14. | 1-Jan-24         1      1000        D   1/1/2024 6:17          F   1/1/2024 6:32 |
     15. | 1-Jan-24         1      1000        E   1/1/2024 6:28          F   1/1/2024 6:32 |
         |----------------------------------------------------------------------------------|
     16. | 1-Jan-24         2      1001        A   1/1/2024 7:00          B   1/1/2024 7:03 |
     17. | 1-Jan-24         2      1001        A   1/1/2024 7:00          C   1/1/2024 7:09 |
     18. | 1-Jan-24         2      1001        A   1/1/2024 7:00          D   1/1/2024 7:15 |
     19. | 1-Jan-24         2      1001        A   1/1/2024 7:00          E   1/1/2024 7:26 |
     20. | 1-Jan-24         2      1001        A   1/1/2024 7:00          F   1/1/2024 7:32 |
     21. | 1-Jan-24         2      1001        B   1/1/2024 7:05          C   1/1/2024 7:09 |
     22. | 1-Jan-24         2      1001        B   1/1/2024 7:05          D   1/1/2024 7:15 |
     23. | 1-Jan-24         2      1001        B   1/1/2024 7:05          E   1/1/2024 7:26 |
     24. | 1-Jan-24         2      1001        B   1/1/2024 7:05          F   1/1/2024 7:32 |
     25. | 1-Jan-24         2      1001        C   1/1/2024 7:12          D   1/1/2024 7:15 |
     26. | 1-Jan-24         2      1001        C   1/1/2024 7:12          E   1/1/2024 7:26 |
     27. | 1-Jan-24         2      1001        C   1/1/2024 7:12          F   1/1/2024 7:32 |
     28. | 1-Jan-24         2      1001        D   1/1/2024 7:17          E   1/1/2024 7:26 |
     29. | 1-Jan-24         2      1001        D   1/1/2024 7:17          F   1/1/2024 7:32 |
     30. | 1-Jan-24         2      1001        E   1/1/2024 7:28          F   1/1/2024 7:32 |
         +----------------------------------------------------------------------------------+
    
    .

    Comment


    • #3
      Thanks Andrew Musau this is a really nice tip

      Comment

      Working...
      X