Dear colleagues,
I recently posted a question about duplicated values: http://www.statalist.org/forums/foru...-in-panel-data
I got useful feedback, but I also realized that I will have to resolve some duplicates manually. To do this efficiently, I would like to isolate one specific type of duplicate, namely duplicates in terms of a3 and year. Tagging these is easy in Stata:
duplicates tag a3 year, gen(isdup)
The result is shown in the attachment.
I can now sort on isdup and see all the problem cases. However, to fix the data properly, I need to see ALL the observations of each farmer (a3) who has at least one duplicate. So basically, I want to see what I posted in the picture below, not just the two duplicated observations on their own.
Is it therefore possible to generate a variable that flags ALL observations of a farmer if at least one observation of that farmer is a duplicate?
Thank you very much again!
Janka
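One possible approach, offered as a minimal sketch rather than a definitive answer: take the maximum of the duplicate tag within each farmer, so that every observation of a farmer inherits the flag. The toy dataset and the variable name anydup are hypothetical and only serve to make the example runnable; a3 is assumed to be numeric here, and with real data the first block would be skipped.

```stata
* Toy data (hypothetical): farmer 1 has a duplicated year, farmer 2 does not.
clear
input a3 year
1 2010
1 2011
1 2011
2 2010
2 2011
end

* Tag farmer-year duplicates: isdup is 0 for unique pairs, positive otherwise.
* Skip this line if isdup already exists in your data.
duplicates tag a3 year, generate(isdup)

* Spread the flag to all observations of the same farmer:
* anydup is 1 if any observation of that farmer is a duplicate, else 0.
bysort a3: egen byte anydup = max(isdup > 0)

* Show the complete history of every affected farmer.
sort a3 year
list a3 year isdup if anydup, sepby(a3)
```

In this toy example all three observations of farmer 1 get anydup equal to 1 and both observations of farmer 2 get 0, so browsing or listing with "if anydup" shows the duplicates together with the rest of that farmer's records.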