Renaming duplicate observations in a panel survey dataset

Clarisse Nguedam

Join Date: Jun 2017

Posts: 18
#1

Renaming duplicate observations in a panel survey dataset

01 Dec 2021, 10:26

Dear all,

I have a large survey panel dataset on individuals with duplicate observations (in some cases, the same individual name appears two, three, or more times across several different areas). There is no individual ID number or ID name in the dataset. I would like to rename the duplicates. Please, is there a Stata command to rename duplicates in panel data?

Thank you for your help.

Best regards,

Clarisse
Fei Wang

Join Date: Oct 2021

Posts: 726
#2

01 Dec 2021, 10:32

-duplicates drop- drops observations that are exact copies of another observation on every variable. Beyond that, be cautious: different individuals can share the same name.
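As a minimal sketch of that distinction, assuming a hypothetical name variable ind_name, a hypothetical area variable, and made-up toy data: -duplicates drop- with no varlist removes only observations that match on every variable, while -duplicates report- and -duplicates tag- show which names still repeat without deleting anything.

```stata
* Toy data, made up for illustration; ind_name and area are hypothetical names
clear
input str10 ind_name str5 area
"Ana" "North"
"Ana" "North"
"Ana" "South"
"Ben" "East"
end

* Drop only observations identical on ALL variables (the second Ana/North row)
duplicates drop

* Summarize and flag names that still appear more than once
duplicates report ind_name
duplicates tag ind_name, generate(dup_name)

* dup_name holds the number of OTHER observations sharing the same name
list, sepby(ind_name)
```

Whether the remaining same-name rows are the same person observed in several areas or different people who happen to share a name is something only the data can settle.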
Rich Goldstein

Join Date: Mar 2014

Posts: 4548
#3

01 Dec 2021, 10:48

Your situation is not entirely clear to me (please follow the advice in the FAQ and use -dataex- to show a data example), but my guess is that the following would be at least as convenient as renaming. Use, e.g.,

Code:

bys ind_name: gen count=_n

This creates a new variable (I called it "count", but you can call it whatever you want) holding a running count of the occurrences of each name. The combination of name (I called it ind_name because you did not tell us the actual variable name) and count should give you a distinct ID for each observation. Note that if you have something else that would help distinguish observations, e.g., a date, you should include it as well.
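A minimal sketch of this idea on made-up toy data follows. The variable names ind_name, svy_date, occ, obs_id, and new_name are all hypothetical, and svy_date stands in for whatever extra distinguishing variable is available. A single -bysort- call numbers the occurrences within every name at once, so nothing has to be repeated name by name.

```stata
* Toy data, made up for illustration; all variable names are hypothetical
clear
input str10 ind_name float svy_date
"Ana" 3
"Ben" 1
"Ana" 1
"Ana" 2
end

* One pass numbers the occurrences within EVERY name; putting svy_date
* in parentheses sorts within name so the numbering is reproducible
bysort ind_name (svy_date): generate occ = _n

* A single numeric ID for each name + occurrence combination
egen long obs_id = group(ind_name occ)

* A renamed version of the name, e.g. "Ana_2" for Ana's second occurrence
generate str20 new_name = ind_name + "_" + string(occ)

* Stop with an error unless name + occurrence uniquely identifies observations
isid ind_name occ
list, sepby(ind_name)
```

This treats every repeated name as a separate unit. If some repeats are really the same person observed in different waves, they would need a shared ID instead, which this sketch does not attempt.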
Clarisse Nguedam

Join Date: Jun 2017

Posts: 18
#4

02 Dec 2021, 07:08

Thank you Rich.
This code works, but I have to repeat it for each duplicate. That is tedious because I have hundreds of duplicates. Is there a way to rename all the duplicates at the same time?

Thank you.

Best,

Clarisse
Rich Goldstein

Join Date: Mar 2014

Posts: 4548
#5

02 Dec 2021, 12:54

I'm confused and think that only a data example will help; see

Code:

help dataex