Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Lowest Values in Entire Group

    Hello,

    Would appreciate some help please...
    Low 1-2-3-4- are the wanted columns here
    I need to generate those 4 variables that return the lowest, second lowest, third lowest and fourth lowest values in IPS_L3 by the group in the first column

    Thanks

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input byte group float(IPS_L3 low1 low2 low3 low4)
    1 11 3 6 7 8
    1 15 3 6 7 8
    1 10 3 6 7 8
    1  8 3 6 7 8
    1 12 3 6 7 8
    1  8 3 6 7 8
    1 12 3 6 7 8
    1  7 3 6 7 8
    1  3 3 6 7 8
    1 11 3 6 7 8
    1  6 3 6 7 8
    end

  • #2
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input byte group float IPS_L3
    1 11
    1 15
    1 10
    1  8
    1 12
    1  8
    1 12
    1  7
    1  3
    1 11
    1  6
    end
    
    egen temprank = rank(IPS_L3), unique
    
    forvalues x = 1/4{
        bysort group: egen low`x' = max((temprank == `x')*IPS_L3)
    }
    
    list, sep(0)
    Result:

    Code:
         +-------------------------------------------------------+
         | group   IPS_L3   temprank   low1   low2   low3   low4 |
         |-------------------------------------------------------|
      1. |     1       11          8      3      6      7      8 |
      2. |     1       15         11      3      6      7      8 |
      3. |     1       10          6      3      6      7      8 |
      4. |     1        8          4      3      6      7      8 |
      5. |     1       12          9      3      6      7      8 |
      6. |     1        8          5      3      6      7      8 |
      7. |     1       12         10      3      6      7      8 |
      8. |     1        7          3      3      6      7      8 |
      9. |     1        3          1      3      6      7      8 |
     10. |     1       11          7      3      6      7      8 |
     11. |     1        6          2      3      6      7      8 |
         +-------------------------------------------------------+

    Comment


    • #3
      Thanks Ken, that's working fine.
      Much appreciated,
      Hans

      Comment


      • #4
        #2 ranks every value and then looks for the lowest 4 within each group. #3 implies that this is what is wanted, which surprises me.

        For one, the four lowest values will often be sprinkled among the various groups.

        For another, the code will return 0 if one of those is not found, but nothing says that 0 is not a legitimate value. (If it is, then returning missing is a fix for that.)

        This is how I understood the problem and here is a solution using rangestat from SSC.


        Code:
        * Example generated by -dataex-. For more info, type help dataex
        clear
        input byte group float IPS_L3
        1 11
        1 15
        1 10
        1  8
        1 12
        1  8
        1 12
        1  7
        1  3
        1 11
        1  6
        2  42
        2  666
        2  7
        2  5
        2  1
        end
        
        mata:  
        mata clear
        real rowvector low4_(real matrix X) {
            _sort(X, 1)
            return(X[1], X[2], X[3], X[4])
        }
        end
        
        rangestat (low4_) IPS_L3, interval(group 0 0)
        
        list, sepby(group)
        
        
             +----------------------------------------------------+
             | group   IPS_L3   low4_1   low4_2   low4_3   low4_4 |
             |----------------------------------------------------|
          1. |     1       11        3        6        7        8 |
          2. |     1       15        3        6        7        8 |
          3. |     1       10        3        6        7        8 |
          4. |     1        8        3        6        7        8 |
          5. |     1       12        3        6        7        8 |
          6. |     1        8        3        6        7        8 |
          7. |     1       12        3        6        7        8 |
          8. |     1        7        3        6        7        8 |
          9. |     1        3        3        6        7        8 |
         10. |     1       11        3        6        7        8 |
         11. |     1        6        3        6        7        8 |
             |----------------------------------------------------|
         12. |     2       42        1        5        7       42 |
         13. |     2      666        1        5        7       42 |
         14. |     2        7        1        5        7       42 |
         15. |     2        5        1        5        7       42 |
         16. |     2        1        1        5        7       42 |
             +----------------------------------------------------+
        Last edited by Nick Cox; 13 Oct 2022, 18:14.

        Comment


        • #5
          Thanks Nick, will take a better look in the morning. I needed to put a 'bysort group:' in front of the egen temprank....to get what I needed

          Comment


          • #6
            That report in #5 is entirely consistent with my comment -- but is not what you said in #3!

            Comment


            • #7
              Although I enjoyed thinking up a rangestat solution, as already implied you don't need to install a community-contributed command to do this. Here is another way to do it, which may seem less tricky.


              Code:
              * Example generated by -dataex-. For more info, type help dataex
              clear
              input byte group float IPS_L3
              1 11
              1 15
              1 10
              1  8
              1 12
              1  8
              1 12
              1  7
              1  3
              1 11
              1  6
              2  42
              2  666
              2  7
              2  5
              2  1
              end
              
              gen long order = _n 
              sort group IPS_L3 
              forval j = 1/4 { 
                  by group: gen low4_`j' = IPS_L3[`j']
              }
              
              sort order 
              
              list, sepby(group)
              
              
                  +------------------------------------------------------------+
                   | group   IPS_L3   order   low4_1   low4_2   low4_3   low4_4 |
                   |------------------------------------------------------------|
                1. |     1       11       1        3        6        7        8 |
                2. |     1       15       2        3        6        7        8 |
                3. |     1       10       3        3        6        7        8 |
                4. |     1        8       4        3        6        7        8 |
                5. |     1       12       5        3        6        7        8 |
                6. |     1        8       6        3        6        7        8 |
                7. |     1       12       7        3        6        7        8 |
                8. |     1        7       8        3        6        7        8 |
                9. |     1        3       9        3        6        7        8 |
               10. |     1       11      10        3        6        7        8 |
               11. |     1        6      11        3        6        7        8 |
                   |------------------------------------------------------------|
               12. |     2       42      12        1        5        7       42 |
               13. |     2      666      13        1        5        7       42 |
               14. |     2        7      14        1        5        7       42 |
               15. |     2        5      15        1        5        7       42 |
               16. |     2        1      16        1        5        7       42 |
                   +------------------------------------------------------------+
              .
              Note that if in one or more groups, there were only 1, 2 or 3 values to play with, then some new variables would be returned as missing in some observations, which would be right.

              Comment


              • #8
                A couple of comments:
                • In addition to not correctly dealing with groups with less than 4 observations, another weakness of the code in #2, is that it is not robust to negative values in IPS_L3.
                • a tweak to the code in #2 that would address both problems is as follows (Nick's code in #7 also deals with them fine):
                Code:
                sort group
                by group: egen temprank = rank(IPS_L3), unique
                
                forvalues x = 1/4{
                    by group: egen low`x' = max(cond(temprank == `x',IPS_L3,.))
                }
                With somewhat different data, constructed to illustrate these situations, we get:
                Code:
                . list, sepby(group)
                
                     +---------------------------------------------------------------+
                     | group   IPS_L3   order   temprank   low1   low2   low3   low4 |
                     |---------------------------------------------------------------|
                  1. |     1       11       1          1     11     15      .      . |
                  2. |     1       15       2          2     11     15      .      . |
                     |---------------------------------------------------------------|
                  3. |     2       10       3          3      8      8     10     12 |
                  4. |     2        8       4          2      8      8     10     12 |
                  5. |     2       12       5          4      8      8     10     12 |
                  6. |     2        8       6          1      8      8     10     12 |
                     |---------------------------------------------------------------|
                  7. |     3       12       7          5     -3     -2     -1     11 |
                  8. |     3       -1       8          3     -3     -2     -1     11 |
                  9. |     3       -3       9          1     -3     -2     -1     11 |
                 10. |     3       11      10          4     -3     -2     -1     11 |
                 11. |     3       -2      11          2     -3     -2     -1     11 |
                     +---------------------------------------------------------------+
                Last edited by Hemanshu Kumar; 14 Oct 2022, 07:09.

                Comment

                Working...
                X