I have a dataset with a string variable "x" in which the same values occur several times. I want to create two new variables:
- A count variable "count_x": the number of times a given value of "x" appears
- A ranking variable "ranking_x": the rank of that value of "x" by frequency, with the most frequent value ranked 1
This is a simple example of the logic I used:
use dataset
keep x
bysort x: egen count_x=count(x)
egen ranking_x=rank(count_x), field
list x count_x ranking_x
Below is an example of the values I obtained and the values I want:
x | count_x | obtained ranking_x | desired ranking_x |
CESAR | 3 | 11 | 3 |
CESAR | 3 | 11 | 3 |
CESAR | 3 | 11 | 3 |
JOHN | 6 | 1 | 1 |
JOHN | 6 | 1 | 1 |
JOHN | 6 | 1 | 1 |
JOHN | 6 | 1 | 1 |
JOHN | 6 | 1 | 1 |
JOHN | 6 | 1 | 1 |
MAX | 1 | 14 | 4 |
PAUL | 4 | 7 | 2 |
PAUL | 4 | 7 | 2 |
PAUL | 4 | 7 | 2 |
PAUL | 4 | 7 | 2 |
I understand that it would work if I used the following logic:
use dataset
keep x
bysort x: egen count_x=count(x)
bysort x: keep if _n == 1
egen ranking_x=rank(count_x), field
list x count_x ranking_x
However, since I'm using further variables besides x in my analysis, I'm looking for a solution that allows me to keep all the observations.
Thanks for your help!
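For reference, here is a minimal sketch of one way this might be done without dropping observations. It is not a definitive solution. The idea is to rank the counts on a single flagged observation per value of "x" and then copy that rank to the rest of the group. The `input` block only recreates the example data above, `first_x` is a hypothetical helper variable, and `_N` is used for the count in place of `egen`'s `count()`.

```stata
clear
input str5 x
"CESAR"
"CESAR"
"CESAR"
"JOHN"
"JOHN"
"JOHN"
"JOHN"
"JOHN"
"JOHN"
"MAX"
"PAUL"
"PAUL"
"PAUL"
"PAUL"
end

* Frequency of each value of x, repeated on every observation of the group
bysort x: gen count_x = _N

* Flag exactly one observation per distinct value of x (hypothetical helper)
egen byte first_x = tag(x)

* Rank the counts among the flagged observations only (highest count = 1)
egen ranking_x = rank(count_x) if first_x, field

* Copy each group's rank to its other observations; missing values
* sort last, so the ranked observation is first within each group
bysort x (ranking_x): replace ranking_x = ranking_x[1]

drop first_x
list x count_x ranking_x, sepby(x)
```

The obtained ranks of 1, 7, 11 and 14 arise because `field` ranks each observation by the number of observations with a higher value, so every repeated row counts. Restricting the ranking to one row per value of "x" avoids that. If two values of "x" have the same count, they would share a rank under this sketch.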