Creating a list of unique "values" from different variables

Alejandra Hernandez

Join Date: Feb 2017

Posts: 3
#1

13 Feb 2017, 03:20

Hello,

I am working with a dataset with the names of 20 different medications taken by more than 1,000 patients. I need to know the list of all the medications distributed in 20 string variables.

For example:

Person  Medication1  Medication2  Medication3  Medication4  Medication5 ...
id1     a            b            c
id2     b            d            e
id3     c            a            f            g            h

Result: List of unique values
a
b
c
d
e
f
g
h

I have checked previous posts on how to compute the number of distinct observations, but unfortunately have not succeeded.
http://www.stata.com/support/faqs/da...-observations/

In particular, I have tried with the following:
by medication1 medication2, sort: gen nvals= _n ==1
count if nvals
replace nvals=sum(nvals)
replace nvals= nvals[_N]

Any help would be greatly appreciated.

Kind regards,
Alejandra
Charlie Joyez

Join Date: Dec 2014

Posts: 421
#2

15 Feb 2017, 12:48

Alejandra, this is not the appropriate section of the forum for getting an answer.
Please re-post it in the General section to increase your chances of getting a reply.

As for your particular question: it is not very clear to me.
What do the letters a, b, c, etc. stand for? Are these the medication names? And if you already know that there are 20 medications, why do you need to recover the list this way?
Or are a, b, c, etc. some other information (e.g. amount taken), with medication1 to medication20 being the 20 medication variables?
In that case, the first step would be to do

Code:

tab medication1

This lists all the values (a, b, c, etc.) taken by the variable medication1. If there are not many distinct values in each variable, you could easily compile them by running tabulate on each variable.
Otherwise, a reshape long command might help: it leaves you with a single medication variable, on which you can then run the tab command above.

See

Code:

help reshape
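
The variable-by-variable approach could be sketched as a loop. This is only an illustration: it assumes the variables are named medication1 to medication20 and are stored next to each other in the dataset, which the thread does not confirm.

```stata
* Sketch only: assumes medication1 ... medication20 exist and are
* adjacent in the dataset, so the hyphenated range picks up all 20.
foreach v of varlist medication1-medication20 {
    tabulate `v'
}
```

This produces 20 separate tables, so the distinct names still have to be merged by hand, which is why reshaping to a single variable tends to be more convenient when there are many different names.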

Best,
Charlie
Alejandra Hernandez

Join Date: Feb 2017

Posts: 3
#3

20 Feb 2017, 09:16

Hi Charlie, thanks a lot for your message.

Let me try to explain my question better. All the patients in my study were asked to list up to 20 medications they are currently taking, so I do not know the number of "unique" medications in my dataset. The "a", "b", "c"... are the medication names.

I think reshape long followed by tab will help.

Thanks for the advice Charlie.

Kind regards,
Alejandra
Charlie Joyez

Join Date: Dec 2014

Posts: 421
#4

20 Feb 2017, 09:46

Ok, this is clearer now, and yes what you should do is a reshape long then tab.

Code:

reshape long Medication, i(Person) j(meds)
tab Medication
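
Note that after reshape long the medication names are stored in Medication, while the j() variable only holds the slot number, so Medication is the variable to tabulate. A minimal worked sketch on the toy data from post #1 follows. The variable name slot is made up for illustration, and only five medication variables are entered instead of 20.

```stata
clear
* Toy data from the first post; empty strings mark unused slots.
input str3 Person str1 Medication1 str1 Medication2 str1 Medication3 str1 Medication4 str1 Medication5
"id1" "a" "b" "c" "" ""
"id2" "b" "d" "e" "" ""
"id3" "c" "a" "f" "g" "h"
end

* Wide to long: one row per person and medication slot.
* Medication holds the name; slot (hypothetical name) holds 1 to 5.
reshape long Medication, i(Person) j(slot)

* Drop the unused slots, then tabulate the names with their frequencies.
drop if Medication == ""
tabulate Medication

* Alternatively, reduce the data to one row per distinct name.
keep Medication
duplicates drop
sort Medication
list, noobs
```

On this toy data the final list should contain the eight names a to h. Because the last steps discard the patient-level data, it is probably safest to run them inside preserve and restore, or on a copy of the dataset.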

Best,
Charlie
Alejandra Hernandez

Join Date: Feb 2017

Posts: 3
#5

24 Feb 2017, 09:08

Thanks a lot Charlie!
Charlie Joyez

Join Date: Dec 2014

Posts: 421
#6

26 Feb 2017, 10:36

You're welcome, but remember to post in the appropriate section (General) next time.
You'll get an answer much faster.
Best,
Charlie