  • creating a race_ethnicity variable from multiple individual race variables

    I want to generate a race_ethnicity variable that, when tabulated (tab), returns the frequencies for each race. I currently have individual race variables (race__1 = Asian, race__2 = American Indian/Alaska Native, race__3 = Black/African American, race__4 = Native Hawaiian Pacific Islander, race__5 = White, race__6 = other) and individual race/ethnicity variables (for example, Non-Hispanic White, Non-Hispanic Black, etc.).

    I want a single variable that, when tabulated, returns one table with the frequencies for each race/ethnicity category (that is, how many NH White, NH Black, NH Asian respondents, etc.), so that I don't have to tab each variable individually and can do it in one command.

    Thank you!
    Last edited by Anna Gurolnick; 17 May 2021, 09:55. Reason: race ethnicity

  • #2
    Could you clarify the following?

    1) For those race__# variables, how are the "yes" and "no" coded?
    2) Can respondents check multiple races? E.g. "yes" for both race__1 and race__3.
    3) In your question, you mentioned that "I have individual race/ethnicity variables (example: Non-Hispanic White, Non-Hispanic Black, etc.)". Why are you recreating them?



    • #3
      Ken Chui, they're coded as 0/1. Yes, respondents can check multiple races, so we also coded an additional "Mixed Race" variable. We created the non-Hispanic "race" variables so that we have them by race and also by race/ethnicity individually, but I would like to find code that lets me tab/table the individual race/ethnicity variables all at the same time, if this is possible.

      I am essentially looking for code that saves the labor of tabbing each variable individually by running them all at the same time.

      Thank you!



      • #4
        I'm not sure whether there is a program that does this, but in general it can be done with some carefully laid out -replace- commands. E.g.:

        Code:
        clear
        input race__1 race__2 race__3 race__4 race__5 race__6 hispanic
        1 0 0 0 0 0 0
        0 1 0 0 0 0 0
        0 0 1 0 0 0 0
        0 0 1 0 0 0 1
        0 1 1 0 0 0 0
        end
        
        * First, find out who filled in more than 1 race:
        egen multirace = anycount(race__1-race__6), v(1)
        
        * Then use basic replace:
        gen race = .
        replace race = 1 if race__1 == 1
        replace race = 2 if race__2 == 1
        replace race = 3 if race__3 == 1 & hispanic == 0
        replace race = 4 if race__3 == 1 & hispanic == 1
        replace race = 5 if race__4 == 1
        replace race = 6 if race__5 == 1 & hispanic == 0
        replace race = 7 if race__5 == 1 & hispanic == 1
        replace race = 8 if race__6 == 1
        replace race = 9 if multirace > 1 & multirace < .

        First, use anycount() in -egen- to count how many races each respondent checked. Then use a chain of -replace- commands to build the new race variable. Based on my own practice in the US, I apply the Hispanic/non-Hispanic modifier only to Black and White respondents, but you can revise the code as you see fit. Finally, add a last line that recodes respondents with multiple races. This line must come last; otherwise a multiracial respondent keeps the code of the last single-race category that matched.
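
        To see why the multiple-race line has to come last, here is a minimal sketch on a two-variable toy dataset. It leaves out the hispanic modifier for brevity, and the two-row data are made up for illustration only:

        ```stata
        * Toy data (illustrative assumption): row 1 checked one race, row 2 checked two
        clear
        input race__2 race__3
        0 1
        1 1
        end

        * Count how many of the race indicators equal 1
        egen multirace = anycount(race__2 race__3), values(1)

        gen race = .
        replace race = 2 if race__2 == 1
        replace race = 3 if race__3 == 1
        * Without the next line, row 2 would keep code 3 (the last single-race match)
        replace race = 9 if multirace > 1 & multirace < .

        * Sanity checks: every multiracial respondent is coded 9,
        * and count shows how many respondents are still unclassified
        assert race == 9 if multirace > 1 & multirace < .
        count if missing(race)
        list
        ```

        If the -assert- fails or -count- returns a nonzero number on your real data, some respondents fell through the -replace- chain and the conditions need another look.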

        After that, you can look into -label define- and -label values- to give a label to all the codes in race.
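
        As a rough sketch of that labeling step: the label name race_lbl and the category wording below are my own assumptions, matched to the codes 1 to 9 used in the -replace- chain, so adjust them to your codebook. The small -input- block only stands in for the race variable created earlier. The /// line continuation works in a do-file, not when typed interactively.

        ```stata
        * Stand-in data (illustrative assumption): a few codes of the race variable
        clear
        input race
        1
        3
        4
        9
        end

        * race_lbl is a hypothetical label name; the label text is an assumption
        label define race_lbl ///
            1 "Asian" ///
            2 "American Indian/Alaska Native" ///
            3 "NH Black/African American" ///
            4 "Hispanic Black/African American" ///
            5 "Native Hawaiian/Pacific Islander" ///
            6 "NH White" ///
            7 "Hispanic White" ///
            8 "Other" ///
            9 "Multiple races"

        * Attach the value label to the variable, then tabulate everything at once
        label values race race_lbl
        tab race, missing
        ```

        A single -tab race- then gives all race/ethnicity frequencies in one table, which is what the original question asked for.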
