  • Isolating highest value of a variable using collapse command


    I am working with a dataset of national election data from 2020. It includes the total votes received by each candidate at the county level for all 50 states.



    I would like to collapse the dataset to show the winning candidate in each county, but I'm having trouble finding a command that does this. I want Stata to identify the candidate who received the most votes in each county and drop the other observations. Using the attached screenshot as an example, I want Stata to identify that Trump received the most votes in Autauga County, Alabama, and drop the observations for Biden and Other, then repeat that for every other county/state combination in the dataset.

    I used collapse (max) n_candidatevotes, by (county_name2 state2 candidate2) to collapse down to these four variables, but I'm having a difficult time figuring out how to collapse down further to my desired results.
  • #2
    Code:
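    * Largest vote count within each county (full variable names from #1)
    bysort state2 county_name2: egen maxvotes = max(n_candidatevotes)
    * Keep only the top candidate(s); ties for first place are all kept
    keep if n_candidatevotes == maxvotes
    drop maxvotes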


    • #3
      Code:
      bysort state2 county_name2 (n_candidatevotes) : keep if n_candidatevotes == n_candidatevotes[_N]
      This should work too; it also keeps every candidate tied for the maximum.
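
A minimal sketch of the sort-based approach on made-up data, in case it helps to see the tie behaviour. The state, county, and candidate labels and the vote counts below are hypothetical; only the variable names come from #1. The `tied` flag is an extra added for illustration and is not part of the suggestion above.

```stata
* Toy data: hypothetical labels and vote counts, for illustration only
clear
input str8 state2 str8 county_name2 str6 candidate2 n_candidatevotes
"StateX" "County1" "A"     300
"StateX" "County1" "B"     100
"StateX" "County1" "Other"   5
"StateX" "County2" "A"     200
"StateX" "County2" "B"     200
"StateX" "County2" "Other"  10
end

* Sort by votes within county; the last observation in each group holds the maximum
bysort state2 county_name2 (n_candidatevotes): keep if n_candidatevotes == n_candidatevotes[_N]

* Flag counties where more than one candidate shares the top count
bysort state2 county_name2: generate byte tied = _N > 1

list, sepby(county_name2)
```

In this toy data County1 keeps one observation and County2 keeps two, because its top two candidates are tied. One caveat: Stata sorts missing values last, so if n_candidatevotes can be missing, n_candidatevotes[_N] would be missing within that county; it is probably worth checking for missing values before running this.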
