  • Aggregating and percentage distribution simultaneously

    Hello,

    I am looking for a Stata equivalent of the SPSS AGGREGATE syntax with PIN.
    I have data like this:

    id zipcode race
    1 11364 white
    2 11364 black
    3 98006 white
    4 98006 asian

    From these, I want to get:
    zipcode count asian black white
    11364 2 0% 50% 50%
    98006 2 50% 0% 50%

    Does anyone have an idea how to get this?

    Thank you,

  • #2
    The following example code may start you in a useful direction.
    Code:
    cls
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input byte id str5(zipcode race)
    1 "11364" "white"
    2 "11364" "black"
    3 "98006" "white"
    4 "98006" "asian"
    5 "99345" "white"
    6 "99345" "white"
    7 "99345" "black"
    end
    
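    * Count observations in each zipcode-race cell
    collapse (count) n=id, by(zipcode race)
    list, clean
    * Total number of observations per zipcode
    by zipcode: egen pop = total(n)
    * Proportion of each race within the zipcode
    generate pct = n/pop
    drop n
    format pct %9.2f
    list, clean
    * One row per zipcode, one pct column per race
    reshape wide pct, i(zipcode) j(race) string
    order zipcode pop
    * Races absent from a zipcode are missing after reshape; set them to 0
    recode pct* (.=0)
    * Strip the pct prefix so the columns are named after the races
    rename (pct*) (*)
    list, clean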
    Code:
    . list, clean
    
           zipcode   pop   asian   black   white  
      1.     11364     2    0.00    0.50    0.50  
      2.     98006     2    0.50    0.00    0.50  
      3.     99345     3    0.00    0.33    0.67
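
    If only a displayed table is needed rather than a new dataset, a minimal sketch along the following lines may be enough. It assumes the original four-observation example data and uses -tabulate- with the row and nofreq options to show the distribution of race within each zipcode as row percentages. It does not create variables, so it is not a substitute for the collapse and reshape approach above when the percentages are needed for further work.

    ```stata
    clear
    input byte id str5(zipcode race)
    1 "11364" "white"
    2 "11364" "black"
    3 "98006" "white"
    4 "98006" "asian"
    end

    * Two-way table of zipcode by race:
    * row requests row percentages, nofreq suppresses the raw counts
    tabulate zipcode race, row nofreq
    ```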



    • #3
      Here's one way to do it.

      Code:
      clear
      input id zipcode str5 race
      1 11364 white
      2 11364 black
      3 98006 white
      4 98006 asian
      end
      
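      * One observation per race-zipcode combination, with its count in _freq
      contract race zipcode
      
      * One row per zipcode, one _freq column per race
      reshape wide _freq , i(zipcode) j(race) string
      
      * Combinations that never occur are missing after reshape; set them to 0
      mvencode _freq*, mv(0)
      
      * Row total of the counts, then convert each count to a percentage
      egen total = rowtotal(_freq*)
      quietly foreach v of var _freq* {
          replace `v' = 100 * `v'/total
      }
      
      * Drop the _freq prefix so the columns are named after the races
      rename (_freq*) (*)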
      
      list
      
           +-----------------------------------------+
           | zipcode   asian   black   white   total |
           |-----------------------------------------|
        1. |   11364       0      50      50       2 |
        2. |   98006      50       0      50       2 |
           +-----------------------------------------+
      FWIW, I don't think many of the most frequent posters here are also expert in SPSS. I haven't used it for about 40 years, so I remember nothing. So, giving examples of what you want is more helpful than citing SPSS syntax.

      (For some reason I didn't see @William Lisowski's post when preparing mine. I had this page open for some time while finishing something else.)
      Last edited by Nick Cox; 26 Feb 2019, 11:16.



      • #4
        Thank you very much! Very helpful!
