Creating a prioritised ethnicity group variable

Katherine Richards

Join Date: Mar 2023

Posts: 6
#1

13 Apr 2023, 17:11

Hi all

I have hospital admissions data, organised so that each observation is an admission. There is a variable for ethnicity, but for some patients it differs between admissions (likely because the way people identify changes over time, and because we can record multiple ethnicities in our admissions data, so what is recorded may differ). I want to create a variable for prioritised ethnicity, using the priority allocation system used in my country.

The data looks like

id ethnicity other variables ...
1 3
1 3
2 10
3 11
3 15
3 11

I have generated a new prioritised ethnicity variable using the current ethnicity variable. I then want to replace prioritised ethnicity values with whichever value ranks higher in our allocation system.

I have tried:

generate ethnic2 = ethnic
label variable ethnic2 "Prioritised ethnic group - unified across admissions"

quietly by id: replace ethnic2 = 21 if ethnic2 == 21 // Maori is priority 1
quietly by id: replace ethnic2 = 35 if ethnic2 == 35 // Tokelauan is priority 2
quietly by id: replace ethnic2 = 36 if ethnic2 == 36 // Fijian is priority 3
....

I didn't think this was quite right and it wasn't - when I count before and after the commands the number of admissions with each ethnicity recorded doesn't change.

Any ideas how to do this?

Many thanks

Kate

Ken Chui

Join Date: Aug 2014
Posts: 1058

13 Apr 2023, 18:47

Code:

clear
input id ethnic
1 35
1 36
2 21
3 21
3 36
3 35
end

recode ethnic (21 = 1) (35 = 2) (36 = 3), into(priority)

gen ethnic_2 = .
bysort id (priority): replace ethnic_2 = ethnic[1]

list, sepby(id)

Results:

Code:

     +-----------------------------------+
     | id   ethnic   priority   ethnic_2 |
     |-----------------------------------|
  1. |  1       35          2         35 |
  2. |  1       36          3         35 |
     |-----------------------------------|
  3. |  2       21          1         21 |
     |-----------------------------------|
  4. |  3       21          1         21 |
  5. |  3       35          2         21 |
  6. |  3       36          3         21 |
     +-----------------------------------+

Katherine Richards

Join Date: Mar 2023

Posts: 6
#3

13 Apr 2023, 21:50

That's awesome thanks Ken
Nick Cox

Join Date: Mar 2014

Posts: 35754
#4

14 Apr 2023, 02:59

Your rules don't state what would happen with a combination of say any of 3 10 11 15 and any of 21 35 36. The following is much more conservative than @Ken Chui's code.

Code:

gen any213536 = inlist(ethnicity, 21, 35, 36)
bysort id (any213536 ethnicity) : gen ethnicity2 = ethnicity[1] if any213536[1] == 1 & any213536[_N] == 1
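
Because ethnicity2 is left missing for any patient who has a code outside 21, 35 and 36, it may help to count how many patients remain unresolved. A minimal sketch, assuming the code above has already been run; the variable name onepatient is hypothetical.

Code:

* Flag exactly one admission per patient so each patient is counted once
egen byte onepatient = tag(id)

* Number of patients who were not assigned a prioritised value
count if onepatient & missing(ethnicity2)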

Ken's code would map any two, three or four of 3 10 11 15 to the smallest value for each person: recode leaves values not covered by a rule unchanged, so priority would just equal the original code and the lowest value would thus be returned.

Other way round, if you flesh out Ken's code with a full set of priorities, you are good to go.
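
A minimal sketch of what that fleshed-out version could look like, assuming the same id and ethnic variables as in Ken's example. Only priorities 1 to 3 come from this thread; the remaining rules have to be added from the official allocation table. The (else = .) rule and the name ethnic_prio are assumptions made for illustration: codes without a rule get a missing priority, so a patient with none of the listed codes is left missing instead of silently receiving their lowest code.

Code:

* Priorities 1-3 are from the thread; add the remaining rules before (else = .)
recode ethnic (21 = 1) (35 = 2) (36 = 3) (else = .), generate(priority)

* Within each patient, put the highest-priority admission first
* (missing priorities sort last) and copy its code to every admission
bysort id (priority ethnic): generate ethnic_prio = ethnic[1] if !missing(priority[1])
label variable ethnic_prio "Prioritised ethnic group - unified across admissions"

* Codes that still need a priority rule
tabulate ethnic if missing(priority)

On Ken's example data this should give 35 for id 1 and 21 for ids 2 and 3. On the data in #1, ids 1 to 3 would all be missing until rules for 3, 10, 11 and 15 are added.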