
  • Help with sorting on three variables and ties

    Hello,

    I have an issue with sorting my data. I have three variables and want the data sorted on lopnr and year. Where a year is tied within lopnr, the observation whose cfar value carries over from the previous year should come first. The first table below shows, at year 2017, the order I want. If I run sort lopnr year cfar, I get the second table instead, and that is my problem. I cannot share my dataset because of confidentiality restrictions.

    I thought I could create a new ID variable and sort on that, but that also gives me the order in the second table, whereas I want the order in the first.

    For the new ID I ran:

    egen ID=group(lopnr cfar)

    Is there any way to force egen group to do what I want? Do you have any other ideas how to solve this?

    First table (the order I want):

    cfar  lopnr  year  ID
    275   42     2014  1
    275   42     2015  1
    275   42     2016  1
    275   42     2017  1
    192   42     2017  2
    192   42     2018  2

    Second table (the order I get):

    cfar  lopnr  year  ID
    275   42     2014  2
    275   42     2015  2
    275   42     2016  2
    192   42     2017  1
    275   42     2017  2
    192   42     2018  1

  • #2
    To be completely honest, I'm having a little trouble parsing your post. That said, it seems like you'd like to sort by lopnr, within lopnr by year, and within year by cfar, but cfar should be sorted in descending order. I don't think you need an id variable, just gsort.

    Code:
    clear
    input int(cfar    lopnr    year    ID)
    275    42    2014    2
    275    42    2015    2
    275    42    2016    2
    192    42    2017    1
    275    42    2017    2
    192    42    2018    1
    end
    
    drop ID
    
    gsort lopnr year -cfar
    
    list, clean noobs
    Code:
    . list, clean noobs
    
        cfar   lopnr   year  
         275      42   2014  
         275      42   2015  
         275      42   2016  
         275      42   2017  
         192      42   2017  
         192      42   2018



    • #3
      Thanks for your answer!

      Indeed, your code would solve the issue in my limited example. However, the pattern does not always look like this in the data. It might look like the table below, and with your code that series of observations (and similar ones) would end up in the wrong order.
      cfar lopnr year
      123 43 2013
      123 43 2014
      321 43 2015
      123 43 2015
      321 43 2017



      • #4
        Okay, so if I'm understanding you correctly, if the same year appears twice in a row, the second appearance should appear first? So, essentially, there is already a meaningful ordering here, and you want to reverse it within lopnr and years. Does that mean in situations where you have the same year repeated three or more times, the entire ordering should be reversed? Or do you not have cases where the same year is repeated three or more times?

        In either of those cases, it seems like you want this:

        Code:
        clear
        input int(cfar    lopnr    year)
        275    42    2014
        275    42    2015
        275    42    2016
        192    42    2017
        275    42    2017
        192    42    2018
        123    43    2013
        123    43    2014
        321    43    2015
        123    43    2015
        321    43    2017
        end
        
        bysort lopnr year: gen order = -_n
        sort lopnr year order
        
        list, clean noobs
        Code:
        . list, clean noobs
        
            cfar   lopnr   year   order  
             275      42   2014      -1  
             275      42   2015      -1  
             275      42   2016      -1  
             275      42   2017      -2  
             192      42   2017      -1  
             192      42   2018      -1  
             123      43   2013      -1  
             123      43   2014      -1  
             123      43   2015      -2  
             321      43   2015      -1  
             321      43   2017      -1
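
        One caveat with the code above: bysort lopnr year does not promise to keep tied observations in their original order, because Stata's sort is not stable unless asked to be. A minimal sketch that pins the original order first, assuming the order of the rows as read in is the meaningful one (the variable name obs is only illustrative). These lines would replace the bysort and sort lines:

        ```stata
        * record the current row order so that ties are resolved reproducibly
        gen long obs = _n

        * within each lopnr-year pair, number rows in original order, negated
        bysort lopnr year (obs): gen order = -_n

        * the negated counter reverses the original order within ties
        sort lopnr year order
        drop obs
        ```
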
        Okay, I think I can see what you were going for with the "ID" variable.
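
        If the rule is really "the cfar that was already present in earlier years comes first", a sort key could also be built from the data rather than from row order. A rough sketch, assuming each cfar occupies one unbroken run of years within a lopnr (first_year is a hypothetical variable name):

        ```stata
        * first year in which each cfar appears for a given lopnr
        bysort lopnr cfar (year): gen first_year = year[1]

        * within tied years, the cfar that started earlier sorts first
        sort lopnr year first_year
        ```

        On the example data this puts 275 ahead of 192 in 2017 for lopnr 42, and 123 ahead of 321 in 2015 for lopnr 43. It would not give the intended order if someone returns to an earlier cfar after a gap.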

