
  • Ranking with multiple sub-division

    Hi to all,
    I have a problem managing a dataset of electoral outcomes. It lists elections in different Municipalities for different years. Municipalities do not all have the same number of elections (between 2 and 4), and the elections may occur in different years. For each election I have the name of each candidate, his vote share and the electoral lists that support him. Candidates may be supported by different numbers of lists, from 1 to 8. I would like to create a variable that gives, for each election, the position in which each candidate finished according to his votes.

    I have created a unique identifier for each election (combining Municipality name and Year) and used the command "by election_id: egen rank = rank(candidate_votes),track". However, a candidate supported by several lists appears in several observations, and the command counts each of them, so I cannot calculate the candidate's exact position. I have to keep the separate rows for the electoral lists in order to analyse the parties supporting each candidate.

    As a practical example, consider the 2011 election in Abano Terme.

    rank election_id candidate_votes
    22 Abano Terme2011 327
    21 Abano Terme2011 381
    20 Abano Terme2011 498
    19 Abano Terme2011 606
    16 Abano Terme2011 1355
    16 Abano Terme2011 1355
    16 Abano Terme2011 1355
    11 Abano Terme2011 1900
    11 Abano Terme2011 1900
    11 Abano Terme2011 1900
    11 Abano Terme2011 1900
    11 Abano Terme2011 1900
    6 Abano Terme2011 2820
    6 Abano Terme2011 2820
    6 Abano Terme2011 2820
    6 Abano Terme2011 2820
    6 Abano Terme2011 2820
    1 Abano Terme2011 3442
    1 Abano Terme2011 3442
    1 Abano Terme2011 3442
    1 Abano Terme2011 3442
    1 Abano Terme2011 3442


    Instead, I want rank to run consecutively from 1 for the candidate with the most votes, treating the rows of the same candidate as one even though they refer to different lists. It should therefore take the values:

    rank election_id candidate_votes
    8 Abano Terme2011 327
    7 Abano Terme2011 381
    6 Abano Terme2011 498
    5 Abano Terme2011 606
    4 Abano Terme2011 1355
    4 Abano Terme2011 1355
    4 Abano Terme2011 1355
    3 Abano Terme2011 1900
    3 Abano Terme2011 1900
    3 Abano Terme2011 1900
    3 Abano Terme2011 1900
    3 Abano Terme2011 1900
    2 Abano Terme2011 2820
    2 Abano Terme2011 2820
    2 Abano Terme2011 2820
    2 Abano Terme2011 2820
    2 Abano Terme2011 2820
    1 Abano Terme2011 3442
    1 Abano Terme2011 3442
    1 Abano Terme2011 3442
    1 Abano Terme2011 3442
    1 Abano Terme2011 3442

    and this repeated for every single election. Thanks in advance for the help.

  • #2
    Welcome to Statalist. There is an FAQ at the top of this page, and it'd be great if you could refer to section 12 on how to share data with a command called dataex. The result looks something like this, which is much easier for us to use.

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float rank str25 election_id float candidate_votes
    22 "Abano Terme2011"  327
    21 "Abano Terme2011"  381
    20 "Abano Terme2011"  498
    19 "Abano Terme2011"  606
    16 "Abano Terme2011" 1355
    16 "Abano Terme2011" 1355
    16 "Abano Terme2011" 1355
    11 "Abano Terme2011" 1900
    11 "Abano Terme2011" 1900
    11 "Abano Terme2011" 1900
    11 "Abano Terme2011" 1900
    11 "Abano Terme2011" 1900
     6 "Abano Terme2011" 2820
     6 "Abano Terme2011" 2820
     6 "Abano Terme2011" 2820
     6 "Abano Terme2011" 2820
     6 "Abano Terme2011" 2820
     1 "Abano Terme2011" 3442
     1 "Abano Terme2011" 3442
     1 "Abano Terme2011" 3442
     1 "Abano Terme2011" 3442
     1 "Abano Terme2011" 3442
    end
    Here is a possible solution to your case. By the way, the ranks listed in your data are "field" ranks (highest value ranked 1), but your command specifies "track". I'm not sure which one you actually want, so swap that option as you see fit.

    Code:
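    * Tag one observation per distinct vote total within each election
    egen uniq = tag(election_id candidate_votes)
    * Rank only the tagged observations; field gives rank 1 to the highest total
    bysort election_id: egen wanted = rank(candidate_votes) if uniq, field
    * Copy the rank to the untagged rows; sorting on wanted puts the nonmissing value first
    bysort election_id candidate_votes (wanted): replace wanted = wanted[1] if missing(wanted)
    drop uniq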
    Results:

    Code:
         +--------------------------------------------+
         | rank       election_id   candid~s   wanted |
         |--------------------------------------------|
      1. |   22   Abano Terme2011        327        8 |
         |--------------------------------------------|
      2. |   21   Abano Terme2011        381        7 |
         |--------------------------------------------|
      3. |   20   Abano Terme2011        498        6 |
         |--------------------------------------------|
      4. |   19   Abano Terme2011        606        5 |
         |--------------------------------------------|
      5. |   16   Abano Terme2011       1355        4 |
      6. |   16   Abano Terme2011       1355        4 |
      7. |   16   Abano Terme2011       1355        4 |
         |--------------------------------------------|
      8. |   11   Abano Terme2011       1900        3 |
      9. |   11   Abano Terme2011       1900        3 |
     10. |   11   Abano Terme2011       1900        3 |
     11. |   11   Abano Terme2011       1900        3 |
     12. |   11   Abano Terme2011       1900        3 |
         |--------------------------------------------|
     13. |    6   Abano Terme2011       2820        2 |
     14. |    6   Abano Terme2011       2820        2 |
     15. |    6   Abano Terme2011       2820        2 |
     16. |    6   Abano Terme2011       2820        2 |
     17. |    6   Abano Terme2011       2820        2 |
         |--------------------------------------------|
     18. |    1   Abano Terme2011       3442        1 |
     19. |    1   Abano Terme2011       3442        1 |
     20. |    1   Abano Terme2011       3442        1 |
     21. |    1   Abano Terme2011       3442        1 |
     22. |    1   Abano Terme2011       3442        1 |
         +--------------------------------------------+
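
    For comparison, here is a minimal alternative sketch that gets the same consecutive ranking without egen, by counting the distinct vote totals from the top down within each election. It assumes candidate_votes has no missing values; negvotes and wanted2 are names made up for this illustration, and the input is a shortened subset of the example data above.

    ```stata
    clear
    input str25 election_id float candidate_votes
    "Abano Terme2011"  327
    "Abano Terme2011"  381
    "Abano Terme2011" 1355
    "Abano Terme2011" 1355
    "Abano Terme2011" 3442
    "Abano Terme2011" 3442
    end

    * Negate the votes so that an ascending sort puts the highest total first
    gen double negvotes = -candidate_votes

    * Running count of changes in the vote total within each election:
    * the count goes up by 1 each time a new (lower) total starts
    bysort election_id (negvotes): gen wanted2 = sum(negvotes != negvotes[_n-1])

    drop negvotes
    list, sepby(wanted2)
    ```

    Like the tag() approach, this treats rows with the same vote total in the same election as one candidate, so it is only safe if no two candidates tie.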



    • #3
      What happens if two or more candidates tie on number of votes? Otherwise put, which variable is the candidate identifier?



      • #4
        Thank you very much Ken Chui, the code works! Also, I apologise for not using dataex, as this is my first post on the forum.
        To answer Nick Cox: when there is a tie in votes (very few cases in the sample anyway), the election goes to a second round of balloting, but I'm limiting my analysis to the elections that ended in the first round, so I don't have the problem of tied votes. As an identifier for the candidates, I have their first and last names.
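
        In case ties ever do matter, here is a rough sketch of the same idea keyed on the candidate rather than on the vote total, so that two different candidates with equal votes are not merged into one. The variable name candidate_name is an assumption (the real dataset has first and last names, under names not shown here), position and one_per_cand are made up for the illustration, and the data below are invented toy values.

        ```stata
        clear
        input str25 election_id str10 candidate_name float candidate_votes
        "Town A2011" "Rossi"   500
        "Town A2011" "Rossi"   500
        "Town A2011" "Bianchi" 300
        "Town A2011" "Verdi"   300
        "Town A2011" "Verdi"   300
        "Town A2011" "Neri"    100
        end

        * Tag one row per candidate within each election
        egen one_per_cand = tag(election_id candidate_name)

        * Rank candidates (not rows); with field, tied candidates share the better rank
        bysort election_id: egen position = rank(candidate_votes) if one_per_cand, field

        * Spread each candidate's rank to all of his rows (nonmissing sorts first)
        bysort election_id candidate_name (position): replace position = position[1]

        drop one_per_cand
        list, sepby(candidate_name)
        ```

        With field ranks, the two tied candidates both get position 2 and the next candidate gets position 4, so this is not the consecutive numbering asked for in #1 whenever a tie occurs.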

