
  • creating a dummy variable based on percentage

    The following sample dataset contains United States House election results for 2002-2020. The candidatevotes variable gives the number of votes received by the candidate representing the party named in the party variable (Democrat, Republican, Green, independent), and the totalvotes variable gives the total number of votes cast in that state-district.

    I want to create an indicator for the incumbent House Representative being of the same party as the President (the democrat_pres variable indicates whether the year has a Democratic president).

    Also, I want to create a competitiveness indicator that equals 1 if the Democratic vote share is 40-45%, 2 if it is 46-50%, 3 if it is 51-55%, 4 if it is 55-60%, and 5 if it is 60%.

    Can anyone kindly guide me on how to do the above?

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input int year str20 state byte(district state_fips) str47 party long(candidatevotes totalvotes) float democrat_pres
    2016 "ALABAMA"     1 1 "REPUBLICAN"  208083 215893 1
    2016 "ALABAMA"     1 1 ""              7810 215893 1
    2016 "ALABAMA"     2 1 "REPUBLICAN"  134886 276584 1
    2016 "ALABAMA"     2 1 "DEMOCRAT"    112089 276584 1
    2016 "ALABAMA"     2 1 ""             29609 276584 1
    2016 "ALABAMA"     3 1 "DEMOCRAT"     94549 287104 1
    2016 "ALABAMA"     3 1 "REPUBLICAN"  192164 287104 1
    2016 "ALABAMA"     3 1 ""               391 287104 1
    2016 "ALABAMA"     4 1 "REPUBLICAN"  235925 239444 1
    2016 "ALABAMA"     4 1 ""              3519 239444 1
    2016 "ALABAMA"     5 1 "REPUBLICAN"  205647 308326 1
    2016 "ALABAMA"     5 1 "DEMOCRAT"    102234 308326 1
    2002 "CALIFORNIA"  6 6 "REPUBLICAN"   62052 209563 0
    2002 "CALIFORNIA"  6 6 "LIBERTARIAN"   4936 209563 0
    2002 "CALIFORNIA"  7 6 "REPUBLICAN"   36584 138376 0
    2002 "CALIFORNIA"  7 6 "DEMOCRAT"     97849 138376 0
    2002 "CALIFORNIA"  7 6 "LIBERTARIAN"   3943 138376 0
    2002 "CALIFORNIA"  8 6 "REPUBLICAN"   20063 160441 0
    2002 "CALIFORNIA"  8 6 "LIBERTARIAN"   2659 160441 0
    2002 "CALIFORNIA"  8 6 "GREEN"        10033 160441 0
    2002 "CALIFORNIA"  8 6 "DEMOCRAT"    127684 160441 0
    2002 "CALIFORNIA"  8 6 ""                 2 160441 0
    2002 "CALIFORNIA"  9 6 "DEMOCRAT"    135893 166917 0
    2002 "CALIFORNIA"  9 6 ""                 6 166917 0
    2002 "CALIFORNIA"  9 6 "LIBERTARIAN"   5685 166917 0
    2002 "CALIFORNIA"  9 6 "REPUBLICAN"   25333 166917 0
    2002 "CALIFORNIA" 10 6 "DEMOCRAT"    126390 167197 0
    2002 "CALIFORNIA" 10 6 "LIBERTARIAN"  40807 167197 0
    2020 "ARIZONA"     1 4 "REPUBLICAN"                       176709 365178 0
    2020 "ARIZONA"     1 4 "DEMOCRAT"                         188469 365178 0
    2020 "ARIZONA"     2 4 "DEMOCRAT"                         209945 381054 0
    2020 "ARIZONA"     2 4 "REPUBLICAN"                       170975 381054 0
    2020 "ARIZONA"     2 4 "WRITE-IN (COMMON SENSE MODERATE)"     35 381054 0
    2020 "ARIZONA"     2 4 "WRITE-IN (INDEPENDENT)"               99 381054 0
    2020 "ARIZONA"     3 4 "REPUBLICAN"                        95594 269837 0
    2020 "ARIZONA"     3 4 "DEMOCRAT"                         174243 269837 0
    2020 "ARIZONA"     4 4 "WRITE-IN (INDEPENDENT)"               39 398623 0
    2020 "ARIZONA"     4 4 "WRITE-IN (LIBERTARIAN)"               67 398623 0
    2020 "ARIZONA"     4 4 "DEMOCRAT"                         120484 398623 0
    2020 "ARIZONA"     4 4 "WRITE-IN (REPUBLICAN)"                 5 398623 0
    2020 "ARIZONA"     4 4 "WRITE-IN (INDEPENDENT)"                7 398623 0
    2020 "ARIZONA"     4 4 "WRITE-IN (DEMOCRATIC)"                19 398623 0
    2020 "ARIZONA"     4 4 "REPUBLICAN"                       278002 398623 0
    2020 "ARIZONA"     5 4 "REPUBLICAN"                       262414 445657 0
    end
    Last edited by Tariq Abdullah; 29 Nov 2022, 20:51.

  • #2
    Three questions:
    • how do we know the party of the incumbent House Representative?
    • what do you want to do with write-in candidates who are Democratic? Do they count towards the Democratic vote share, or do only the "Democrat" candidates count?
    • can you specify your vote share cut-offs more carefully? The range in your sample data starts from much lower (around 30%) and goes up much higher (over 80%). What do you want to do with those ranges? Also, do you want to round the percentage shares to the nearest number or always round it down? e.g. does 50.9% become 51% and is coded as 3, or does it get counted as a 2?


    • #3
      Answer to the 1st question: Within a given year and a given district of a given state, the candidate with the highest candidatevotes among all candidates from the different parties (Democrat, Republican, Green, independent) is the incumbent House Representative, and the party variable shows which party that representative belongs to. For example, in the following subsample for district 4 of Arizona (state_fips == 4) in 2020, the House Representative is from the Republican party, since that candidate has the highest candidatevotes among all candidates in the district.

      Code:
      2020 "ARIZONA" 4 4 "WRITE-IN (INDEPENDENT)" 39 398623 0
      2020 "ARIZONA" 4 4 "WRITE-IN (LIBERTARIAN)" 67 398623 0
      2020 "ARIZONA" 4 4 "DEMOCRAT" 120484 398623 0
      2020 "ARIZONA" 4 4 "WRITE-IN (REPUBLICAN)" 5 398623 0
      2020 "ARIZONA" 4 4 "WRITE-IN (INDEPENDENT)" 7 398623 0
      2020 "ARIZONA" 4 4 "WRITE-IN (DEMOCRATIC)" 19 398623 0
      2020 "ARIZONA" 4 4 "REPUBLICAN" 278002 398623 0
      Answer to the 2nd question: I want write-in candidates labelled Democratic to count towards the Democratic vote share. My sincerest apologies for not making this clear in post #1. Thanks for bringing this to my attention.

      Answer to the 3rd question: Clearly I hadn't thought it through, and I should have been more careful before writing post #1. Since the share ranges from 30% to 80%, I want the variable to take the values 1, 2, 3, 4, 5, 6, and so on in increments of 5%: 25-30% as 1, 31-35% as 2, 36-40% as 3, 41-45% as 4, and so on up to 75-80%.

      If a value such as 30.5% appears, I want the variable to be 2. For 35.5%, it should be 3.

      Thanks again for raising this issue and giving me the chance to clear things up!


      • #4
        Consider this code:
        Code:
        * Within each year-state-district, sort so the top vote-getter is last
        sort year state district candidatevotes
        * Flag districts whose winner's party label contains "DEMOCRAT"
        by year state district: gen byte incumbent_democrat = strpos(party[_N],"DEMOCRAT") > 0
        gen byte incumbent_pres_match = (incumbent_democrat == democrat_pres)

        * Sum votes over all Democratic rows, including "WRITE-IN (DEMOCRATIC)"
        by year state district: egen democrat_votes = total(cond(strpos(party,"DEMOCRAT"),candidatevotes,.)), miss
        gen democrat_voteperc = democrat_votes/totalvotes * 100
        gen democrat_voteperc_rounded = round(democrat_voteperc)

        * Bin the rounded share into 5-point intervals starting at 30
        egen competitive = cut(democrat_voteperc_rounded), at(30(5)85) label
        drop incumbent_democrat democrat_votes democrat_voteperc_rounded
        Here I use simple rounding of the vote percentages to create the discrete variable. So 34.5% is rounded to 35 but 34.49% is rounded to 34, and the two therefore land in different bins.

        Some of the output:

        Code:
        . list if state == "ARIZONA", noobs sepby(district) ab(20)
        
          +----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
          | year     state   district   state_fips                              party   candidatevotes   totalvotes   democrat_pres   incumbent_pres_match   democrat_voteperc   competitive |
          |----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
          | 2020   ARIZONA          1            4                         REPUBLICAN           176709       365178               0                      0            51.61017           50- |
          | 2020   ARIZONA          1            4                           DEMOCRAT           188469       365178               0                      0            51.61017           50- |
          |----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
          | 2020   ARIZONA          2            4   WRITE-IN (COMMON SENSE MODERATE)               35       381054               0                      0            55.09587           55- |
          | 2020   ARIZONA          2            4             WRITE-IN (INDEPENDENT)               99       381054               0                      0            55.09587           55- |
          | 2020   ARIZONA          2            4                         REPUBLICAN           170975       381054               0                      0            55.09587           55- |
          | 2020   ARIZONA          2            4                           DEMOCRAT           209945       381054               0                      0            55.09587           55- |
          |----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
          | 2020   ARIZONA          3            4                         REPUBLICAN            95594       269837               0                      0            64.57343           65- |
          | 2020   ARIZONA          3            4                           DEMOCRAT           174243       269837               0                      0            64.57343           65- |
          |----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
          | 2020   ARIZONA          4            4              WRITE-IN (REPUBLICAN)                5       398623               0                      1            30.22982           30- |
          | 2020   ARIZONA          4            4             WRITE-IN (INDEPENDENT)                7       398623               0                      1            30.22982           30- |
          | 2020   ARIZONA          4            4              WRITE-IN (DEMOCRATIC)               19       398623               0                      1            30.22982           30- |
          | 2020   ARIZONA          4            4             WRITE-IN (INDEPENDENT)               39       398623               0                      1            30.22982           30- |
          | 2020   ARIZONA          4            4             WRITE-IN (LIBERTARIAN)               67       398623               0                      1            30.22982           30- |
          | 2020   ARIZONA          4            4                           DEMOCRAT           120484       398623               0                      1            30.22982           30- |
          | 2020   ARIZONA          4            4                         REPUBLICAN           278002       398623               0                      1            30.22982           30- |
          |----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
          | 2020   ARIZONA          5            4                         REPUBLICAN           262414       445657               0                      1                   .             . |
          +----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
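
        A small sketch of how the rounding rule above treats shares near a bin edge, using made-up vote shares (the toy values and the variable competitive_alt are illustrative assumptions, not part of the data in #1). It also shows one possible way to code the intervals described in #3, where 25-30% is 1 and 30.5% falls in 2. That alternative is an assumption about the intended rule, not something confirmed in the thread.

        ```stata
        * Toy data: hypothetical vote shares chosen to sit near bin edges
        clear
        input float democrat_voteperc
        30
        30.5
        34.49
        34.5
        35.5
        end

        * Rule from the code above: round to the nearest integer,
        * then cut into [30,35), [35,40), ... labelled "30-", "35-", ...
        gen rounded = round(democrat_voteperc)
        egen competitive = cut(rounded), at(30(5)85) label

        * Alternative (an assumption): intervals (25,30], (30,35], ..., (75,80]
        * coded 1, 2, ..., 11; shares outside (25,80] are left missing
        gen byte competitive_alt = ceil(democrat_voteperc/5) - 5 if democrat_voteperc > 25 & democrat_voteperc <= 80

        list, noobs
        ```

        Under the rounding rule, 34.49 and 34.5 end up in different bins, while the ceil()-based coding puts both in the same interval. Which rule is appropriate depends on how the cut-offs are meant to be read.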
        Last edited by Hemanshu Kumar; 30 Nov 2022, 00:41.


        • #5
          Thanks so much for being so thoughtful and generous with this code! It really helped me get through this tricky problem. I appreciate your time and patience!
