Generate tag "duplicate" for all the values of variable ID if at least one value is duplicated

Janka Vanschoenwinkel

Join Date: Jan 2016

Posts: 49
#1

28 Jan 2016, 07:54

Dear colleagues,

I recently posted a question about duplicated values: http://www.statalist.org/forums/foru...-in-panel-data

I got useful feedback, but I also realized that for some duplicates I will have to intervene manually. To do this efficiently, I would like to pick out only one specific type of duplicate, namely duplicates of a3 and year. This is easy in Stata:

duplicates tag a3 year, gen(isdup)

The result is in the attachment!

I can now sort on isdup and see all the problems. However, in order to fix the data properly, I need to see ALL the observations of each farmer (a3) who has at least one duplicate. So basically, I want to see what I posted in the picture below, and not only the two duplicates on their own.

Is it therefore possible to generate a variable that flags ALL observations of a farmer if at least one observation of that farmer is a duplicate?

Thank you very much again!

Janka

1 Photo
Clyde Schechter

Join Date: Apr 2014

Posts: 30097
#2

28 Jan 2016, 09:49

So, I assume that somewhere in your data set you have a variable that identifies farmers. Let's call it farmer.

Code:

duplicates tag a3 year, gen(isdup)
egen has_a_dup = max(isdup), by(farmer)
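To see what those two commands do, here is a minimal sketch on a toy data set. The data values are invented purely for illustration, and the sketch assumes (as the original question suggests) that a3 itself identifies the farmer, so by(a3) stands in for the placeholder variable farmer.

```stata
* Toy data, invented for illustration: farmer 1 has year 2005 twice
clear
input a3 year
1 2004
1 2005
1 2005
2 2004
2 2005
3 2004
end

* isdup counts the surplus copies of each (a3, year) pair; 0 means unique
duplicates tag a3 year, gen(isdup)

* Spread the largest isdup within each farmer to all of that farmer's rows
egen has_a_dup = max(isdup), by(a3)

* List every observation of every farmer with at least one duplicate
sort a3 year
list a3 year isdup has_a_dup if has_a_dup > 0, sepby(a3)
```

Because has_a_dup is constant within a farmer, filtering on has_a_dup > 0 keeps the non-duplicated years of an affected farmer as well as the duplicated ones, which is what is needed for manual inspection.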

By the way, in the future please don't post screen shots to show data examples. This one happens to be readable, but often they are not. Even when readable, if it were necessary to experiment with the data, nobody can recreate the data example by copying and pasting--it has to be retyped in by hand, which is tedious and error-prone. The way to share data examples is to use the -dataex- command. If you do not already have that installed, run -ssc install dataex- and read -help dataex- to learn how to use it. It's easy! Thanks.
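As a rough sketch of what that looks like in practice, assuming the variable names from this thread (a3, year, isdup) and an arbitrary choice of the first 20 observations:

```stata
* One-time installation from SSC (requires an internet connection)
ssc install dataex

* Write the first 20 observations of the listed variables to the
* Results window as code that can be pasted straight into a forum post
dataex a3 year isdup in 1/20
```

The variable list and the in 1/20 range are only placeholders; pick whatever subset reproduces the problem.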
Janka Vanschoenwinkel

Join Date: Jan 2016

Posts: 49
#3

28 Jan 2016, 12:02

Clyde! This is absolutely perfect! Thank you very very much!

I'll take into account your comment about the screen shots as well in the future!

Thank you very much!