Variable for _n per distinct id

Hanna Larsson

Join Date: Jul 2023

Posts: 11
#1

14 Aug 2023, 08:26

Hi!
I have a problem I would appreciate your help with. My data (>700 000 rows) consists of the variables id, id2 and var, and I want to generate newvar (see below).
newvar should number the distinct id2 values within each id, starting at 1 and counting upwards, with the order defined by the quantity of var.

data
id  id2  var   newvar
1   1    1000  1
1   2    2000  2
2   3    1200  1
2   3    8000  1

If I use _n, every row gets its own number, so rows that share an id2 are counted separately. I want newvar to be the same for all rows with the same id2.

Thank you so much!
Andrew Musau

Join Date: Oct 2014

Posts: 10280
#2

14 Aug 2023, 08:33

"order defined by the quantity of var"

It is not clear what this means if "id2" does not prescribe the order. You need to expand on this point.

Code:

* running count of distinct id2 values within each id
bysort id (id2): gen wanted = sum(id2 != id2[_n-1])
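
A minimal sketch of how this line behaves on the four example rows from #1 (the names wanted, minvar and wanted2 are illustrative only). Under by:, id2[_n-1] refers to the previous observation within the same id. For the first observation of each id it is missing, so, provided id2 itself is not missing, the comparison is true there and the running sum starts at 1. The second part is only one possible reading of "order defined by the quantity of var": it assumes each id2 should be ranked by its smallest var within id, which the original post does not confirm.

```stata
* Rebuild the four example rows from post #1
clear
input id id2 var
1 1 1000
1 2 2000
2 3 1200
2 3 8000
end

* Within each id, sorted by id2: add 1 whenever id2 differs from the previous row
bysort id (id2): gen wanted = sum(id2 != id2[_n-1])

* Assumed alternative: order the id2 groups by their smallest var instead of by id2
bysort id id2: egen minvar = min(var)
bysort id (minvar id2): gen wanted2 = sum(id2 != id2[_n-1])

list id id2 var wanted wanted2, sepby(id)
```

On the example data both wanted and wanted2 come out as 1, 2, 1, 1, which matches newvar in #1. The two versions would differ only where the id2 order and the var order disagree within an id.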

Last edited by Andrew Musau; 14 Aug 2023, 08:37.
Hanna Larsson

Join Date: Jul 2023

Posts: 11
#3

14 Aug 2023, 08:38

Originally posted by Andrew Musau

It is not clear what this means if "id2" does not prescribe the order. You need to expand on this point.

Code:

bys id (id2): gen wanted= sum(id2!=id2[_n-1])

Thank you!!!