Grouping unique values of one variable by a second variable

Jared Thorpe

Join Date: Sep 2017

Posts: 2
#1

22 Sep 2017, 15:41

Hello, Stata Community,

I am having trouble getting Stata to do what I need. I have a dataset with a long list of company names and a variable that identifies unique compounds. The problem is that the compound identifier is defined at the market level, so the same value may appear several times for one company and again for other companies. I want to generate a new variable that assigns company-level compound identifiers, that is, one that numbers the distinct market-level identifiers 1, 2, ... within each company.

For example:
My data looks like this:

Code:

input byte unique str20 company
37 "Aurobindo Pharma"
42 "Aurobindo Pharma"
42 "Aurobindo Pharma"
20 "Avanthi Inc"
20 "Avema Pharma"
4 "Barr"
4 "Barr"
4 "Barr"
4 "Barr"
4 "Barr"
35 "Barr"
end

And what I need is:

Code:

input byte unique str20 company byte cunique
37 "Aurobindo Pharma" 1
42 "Aurobindo Pharma" 2
42 "Aurobindo Pharma" 2
20 "Avanthi Inc" 1
20 "Avema Pharma" 1
4 "Barr" 1
4 "Barr" 1
4 "Barr" 1
4 "Barr" 1
4 "Barr" 1
35 "Barr" 2
end

If I could combine the by prefix with egen's group() function, it would be exactly what I need. Does anyone know a way around this?

Thanks!

Last edited by Jared Thorpe; 22 Sep 2017, 15:48.
Clyde Schechter

Join Date: Apr 2014

Posts: 28624
#2

22 Sep 2017, 16:39

Code:

by company (unique), sort: gen cunique = sum(unique != unique[_n-1])
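
For readers who want to see why the one-liner works, here is a minimal sketch run against the example data from post #1. Within each company, sorted by unique, the expression unique != unique[_n-1] is 1 whenever a new value starts (under by, unique[_n-1] is missing for the first observation of a group, so the comparison is true there), and sum() keeps a running total of those 1s. The second half is an alternative two-step route that should give the same numbering. The variable names first and cunique2 are made up for illustration and are not part of the original answer.

Code:

```stata
clear
input byte unique str20 company
37 "Aurobindo Pharma"
42 "Aurobindo Pharma"
42 "Aurobindo Pharma"
20 "Avanthi Inc"
20 "Avema Pharma"
4 "Barr"
4 "Barr"
4 "Barr"
4 "Barr"
4 "Barr"
35 "Barr"
end

* One-liner from above: running count of value changes within company
by company (unique), sort: gen cunique = sum(unique != unique[_n-1])

* Alternative sketch (hypothetical names first, cunique2):
* flag the first observation of each company-by-unique block ...
bysort company unique: gen byte first = (_n == 1)
* ... then take a running sum of the flags within company
by company: gen cunique2 = sum(first)

* Both routes should agree on every observation
assert cunique == cunique2

list company unique cunique, sepby(company)
```

Note that both versions sort the data by company and unique, so the original row order is not preserved. If that order matters, one option is to store it first (for example, gen long obsno = _n) and sort on it again afterwards.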
Jared Thorpe

Join Date: Sep 2017

Posts: 2
#3

22 Sep 2017, 16:46

Thank you!!!