
  • adding top 3 variables with the highest value out of 6 variables for each observation

    Hi,

    I have 6 variables, each taking a range of values. For each observation I want to sum the 3 highest values among the 6 variables. Which 3 variables hold the highest values differs from observation to observation. How can I do this?

    Thank you,

  • #2
    You could use rowsort from the Stata Journal. In this example missing values sort to the end of each row, so they are not counted among the top three, which is usually what is wanted. I created some fake data because, contrary to our advice, you didn't give a data example.

    Code:
    clear
    set obs 10 
    set seed 2803
    
    forval j = 1/6 { 
          gen x`j' = cond(runiform() < 0.05, ., runiformint(1, 9)) 
     } 
     
    rowsort x?, gen(X1 X2 X3 X4 X5 X6) descending highmissing 
    
    gen wanted = X1 + X2 + X3 
     
    list, sep(0) 
    
         +--------------------------------------------------------------------+
         | x1   x2   x3   x4   x5   x6   X1   X2   X3   X4   X5   X6   wanted |
         |--------------------------------------------------------------------|
      1. |  6    7    5    9    5    6    9    7    6    6    5    5       22 |
      2. |  2    3    .    6    1    2    6    3    2    2    1    .       11 |
      3. |  1    2    8    4    8    5    8    8    5    4    2    1       21 |
      4. |  7    2    8    3    2    8    8    8    7    3    2    2       23 |
      5. |  .    1    1    9    8    2    9    8    2    1    1    .       19 |
      6. |  2    3    6    4    9    8    9    8    6    4    3    2       23 |
      7. |  2    4    4    8    3    4    8    4    4    4    3    2       16 |
      8. |  4    4    5    5    2    4    5    5    4    4    4    2       14 |
      9. |  3    4    4    6    8    2    8    6    4    4    3    2       18 |
     10. |  7    5    3    4    2    9    9    7    5    4    3    2       21 |
         +--------------------------------------------------------------------+



    • #3
      A second approach, demonstrated with Nick's very helpful example data, might be useful if for some reason you are not able to easily install SSC and Stata Journal packages on your system. Overall, though, the use of rowsort would be my preference.
      Code:
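      generate id = _n
      reshape long x, i(id) j(xnum)
      * drop missing values so they cannot rank among the largest
      drop if missing(x)
      * within id, sort ascending on x: the last three rows hold the three largest values
      bysort id (x): generate wanted = x[_N]+x[_N-1]+x[_N-2]
      reshape wide x, i(id) j(xnum) 
      drop id
      list, sep(0)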
      Code:
      . list, sep(0) 
      
           +--------------------------------------+
           | x1   x2   x3   x4   x5   x6   wanted |
           |--------------------------------------|
        1. |  6    7    5    9    5    6       22 |
        2. |  2    3    .    6    1    2       11 |
        3. |  1    2    8    4    8    5       21 |
        4. |  7    2    8    3    2    8       23 |
        5. |  .    1    1    9    8    2       19 |
        6. |  2    3    6    4    9    8       23 |
        7. |  2    4    4    8    3    4       16 |
        8. |  4    4    5    5    2    4       14 |
        9. |  3    4    4    6    8    2       18 |
       10. |  7    5    3    4    2    9       21 |
           +--------------------------------------+
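
      If reshaping is inconvenient (for example, with a very large dataset), the same sum can probably be built in the wide layout with built-in commands only, by taking the row maximum three times and blanking out one copy of it after each pass. The following is a minimal sketch, not a tested solution: the names wanted2, c_x1-c_x6, rmax, done and hit are hypothetical, and it assumes the data still hold x1-x6 as in the example above.

      ```stata
      * Sketch only: wanted2, c_x*, rmax, done and hit are made-up names.
      generate double wanted2 = 0
      * work on copies so that x1-x6 stay untouched
      foreach v of varlist x1-x6 {
          generate double c_`v' = `v'
      }
      forvalues k = 1/3 {
          * largest remaining value in each row; rowmax() ignores missings
          egen double rmax = rowmax(c_x?)
          * the sum turns missing if fewer than three values are available
          replace wanted2 = wanted2 + rmax
          generate byte done = 0
          foreach v of varlist c_x? {
              * blank out exactly one copy of the maximum, so ties count separately
              generate byte hit = (`v' == rmax) & !done & !missing(rmax)
              replace `v' = . if hit
              replace done = 1 if hit
              drop hit
          }
          drop rmax done
      }
      drop c_x?
      list x1-x6 wanted2, sep(0)
      ```

      On the example data this should give the same values as wanted. Like the two approaches above, it returns missing for any observation with fewer than three non-missing values.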
