Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • How to re-arrange my data?

    Dear Statalists,

    I'd like to rearrange my data in order to analyze it.

    Part of my dataset is:

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float(fake_id match_id)
     4  33
     4 171
     4 165
     8 237
     8  97
     8 179
    12  55
    12 199
    12  11
    16  19
    16 133
    16 129
    20  15
    20  35
    20 182
    24 147
    24  86
    24 185
    28  25
    28 211
    28   5
    32 213
    32  71
    32  70
    36  13
    36  23
    36 186
    40  62
    40  63
    40 207
    end

    What I wan to get is something like:
    id match_set
    4 1
    33 1
    171 1
    165 1
    8 2
    237 2
    97 2
    179 2
    12 3
    55 3
    199 3
    11 3
    ... ...
    Could anyone help with this?

    Thank you in advance.

  • #2
    This may help. At the end drop what you don't care about. Some users may be interested in using -1 for a true value (because I want true to sort before false).

    Note: The variable obs is not really needed, but I keep it in the example because in many problems requiring expand it is useful to return to the original sort order. Indeed it might be that

    Code:
    bysort set (first obs) : replace match_id = fake_id if _n == 1
    is better.

    Code:
    clear
    input float(fake_id match_id)
     4  33
     4 171
     4 165
     8 237
     8  97
     8 179
    12  55
    12 199
    12  11
    16  19
    16 133
    16 129
    20  15
    20  35
    20 182
    24 147
    24  86
    24 185
    28  25
    28 211
    28   5
    32 213
    32  71
    32  70
    36  13
    36  23
    36 186
    40  62
    40  63
    40 207
    end
    
    gen long obs = _n 
    gen set = sum(fake_id != fake_id[_n-1])
    gen first = - (fake_id != fake_id[_n-1]) 
    expand 2 if first 
    bysort set (first) : replace match_id = fake_id if _n == 1  
    
    list, sepby(set)
    
         +----------------------------------------+
         | fake_id   match_id   obs   set   first |
         |----------------------------------------|
      1. |       4          4     1     1      -1 |
      2. |       4         33     1     1      -1 |
      3. |       4        171     2     1       0 |
      4. |       4        165     3     1       0 |
         |----------------------------------------|
      5. |       8          8     4     2      -1 |
      6. |       8        237     4     2      -1 |
      7. |       8         97     5     2       0 |
      8. |       8        179     6     2       0 |
         |----------------------------------------|
      9. |      12         12     7     3      -1 |
     10. |      12         55     7     3      -1 |
     11. |      12        199     8     3       0 |
     12. |      12         11     9     3       0 |
         |----------------------------------------|
     13. |      16         16    10     4      -1 |
     14. |      16         19    10     4      -1 |
     15. |      16        129    12     4       0 |
     16. |      16        133    11     4       0 |
         |----------------------------------------|
     17. |      20         20    13     5      -1 |
     18. |      20         15    13     5      -1 |
     19. |      20         35    14     5       0 |
     20. |      20        182    15     5       0 |
         |----------------------------------------|
     21. |      24         24    16     6      -1 |
     22. |      24        147    16     6      -1 |
     23. |      24        185    18     6       0 |
     24. |      24         86    17     6       0 |
         |----------------------------------------|
     25. |      28         28    19     7      -1 |
     26. |      28         25    19     7      -1 |
     27. |      28        211    20     7       0 |
     28. |      28          5    21     7       0 |
         |----------------------------------------|
     29. |      32         32    22     8      -1 |
     30. |      32        213    22     8      -1 |
     31. |      32         70    24     8       0 |
     32. |      32         71    23     8       0 |
         |----------------------------------------|
     33. |      36         36    25     9      -1 |
     34. |      36         13    25     9      -1 |
     35. |      36        186    27     9       0 |
     36. |      36         23    26     9       0 |
         |----------------------------------------|
     37. |      40         40    28    10      -1 |
     38. |      40         62    28    10      -1 |
     39. |      40         63    29    10       0 |
     40. |      40        207    30    10       0 |
         +----------------------------------------+

    Comment


    • #3
      Originally posted by Nick Cox View Post
      This may help. At the end drop what you don't care about. Some users may be interested in using -1 for a true value (because I want true to sort before false).

      Note: The variable obs is not really needed, but I keep it in the example because in many problems requiring expand it is useful to return to the original sort order. Indeed it might be that

      Code:
      bysort set (first obs) : replace match_id = fake_id if _n == 1
      is better.

      Code:
      clear
      input float(fake_id match_id)
      4 33
      4 171
      4 165
      8 237
      8 97
      8 179
      12 55
      12 199
      12 11
      16 19
      16 133
      16 129
      20 15
      20 35
      20 182
      24 147
      24 86
      24 185
      28 25
      28 211
      28 5
      32 213
      32 71
      32 70
      36 13
      36 23
      36 186
      40 62
      40 63
      40 207
      end
      
      gen long obs = _n
      gen set = sum(fake_id != fake_id[_n-1])
      gen first = - (fake_id != fake_id[_n-1])
      expand 2 if first
      bysort set (first) : replace match_id = fake_id if _n == 1
      
      list, sepby(set)
      
      +----------------------------------------+
      | fake_id match_id obs set first |
      |----------------------------------------|
      1. | 4 4 1 1 -1 |
      2. | 4 33 1 1 -1 |
      3. | 4 171 2 1 0 |
      4. | 4 165 3 1 0 |
      |----------------------------------------|
      5. | 8 8 4 2 -1 |
      6. | 8 237 4 2 -1 |
      7. | 8 97 5 2 0 |
      8. | 8 179 6 2 0 |
      |----------------------------------------|
      9. | 12 12 7 3 -1 |
      10. | 12 55 7 3 -1 |
      11. | 12 199 8 3 0 |
      12. | 12 11 9 3 0 |
      |----------------------------------------|
      13. | 16 16 10 4 -1 |
      14. | 16 19 10 4 -1 |
      15. | 16 129 12 4 0 |
      16. | 16 133 11 4 0 |
      |----------------------------------------|
      17. | 20 20 13 5 -1 |
      18. | 20 15 13 5 -1 |
      19. | 20 35 14 5 0 |
      20. | 20 182 15 5 0 |
      |----------------------------------------|
      21. | 24 24 16 6 -1 |
      22. | 24 147 16 6 -1 |
      23. | 24 185 18 6 0 |
      24. | 24 86 17 6 0 |
      |----------------------------------------|
      25. | 28 28 19 7 -1 |
      26. | 28 25 19 7 -1 |
      27. | 28 211 20 7 0 |
      28. | 28 5 21 7 0 |
      |----------------------------------------|
      29. | 32 32 22 8 -1 |
      30. | 32 213 22 8 -1 |
      31. | 32 70 24 8 0 |
      32. | 32 71 23 8 0 |
      |----------------------------------------|
      33. | 36 36 25 9 -1 |
      34. | 36 13 25 9 -1 |
      35. | 36 186 27 9 0 |
      36. | 36 23 26 9 0 |
      |----------------------------------------|
      37. | 40 40 28 10 -1 |
      38. | 40 62 28 10 -1 |
      39. | 40 63 29 10 0 |
      40. | 40 207 30 10 0 |
      +----------------------------------------+
      Many thanks to your solution.

      Comment

      Working...
      X