Cleaning up duplicates in long data

Tom Lawson

Join Date: Jun 2022

Posts: 13
#1

10 Oct 2022, 08:51

Hello,
I have a data set with repeated measures in long format which I would like to reshape to wide form like this:
reshape wide rass cam_f1 cam_f2_num cam_f3_num cam_f4_num CAMICUv3 cam_7 cam_aphasia, i(record_id) j(cam_date_num)

Unfortunately, I don't yet have a unique identifier for my "j" within each subject. I have a date for each repeated measure, which I've been able to destring into [almost] a unique identifier for this purpose. However, in several cases the date is duplicated within a subject, and I would like to just drop these duplicates. The problem with just using "duplicates drop cam_date_num," is that dates overlap across subjects: subject 1 may have had this assessment on the same date that subject 2 also had one, which for my purposes is not a true duplicate. Is there a way to identify and drop duplicates within a single subject?

I'm using Stata/IC 14.2 via Microsoft Remote Desktop on a MacBook.

Thank you,
Tom Lawson
Hemanshu Kumar

Join Date: Mar 2015

Posts: 1548
#2

10 Oct 2022, 09:33

What about just

Code:

duplicates drop record_id cam_date_num, force

where I am assuming record_id identifies the subject.
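
Because force keeps only the first observation in each record_id/cam_date_num group, even when the other variables differ, it may be worth looking at the duplicates before dropping them. A minimal sketch, assuming the variable names from post #1; dup is a hypothetical helper variable introduced here for illustration:

Code:

```stata
* Report how many record_id/cam_date_num combinations occur more than once
duplicates report record_id cam_date_num

* Flag duplicated rows: dup is 0 for unique rows, >0 otherwise
* (dup is a hypothetical helper variable)
duplicates tag record_id cam_date_num, generate(dup)

* Inspect the flagged rows to see whether the other variables differ
sort record_id cam_date_num
list record_id cam_date_num rass CAMICUv3 if dup > 0, sepby(record_id)
drop dup

* Keep the first observation of each record_id/cam_date_num group
duplicates drop record_id cam_date_num, force

* Stops with an error if the pair still does not uniquely identify rows
isid record_id cam_date_num
```

If isid runs without error, record_id and cam_date_num together identify each observation, which is what reshape wide with i(record_id) j(cam_date_num) requires.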
Tom Lawson

Join Date: Jun 2022

Posts: 13
#3

11 Oct 2022, 09:45

Thank you, Hemanshu; I love that there was a simple way to make this happen. This worked fairly well, although it also dropped many observations that held data in other variables. For the purpose of setting up my wide data, though, I didn't need them anyway.
Thanks for your help!

Tom
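
If the values lost from the dropped rows did matter, one possible alternative to force is to choose which duplicate to keep explicitly. A rough sketch, assuming the eight measure variables from post #1 and assuming that "most complete row wins" is an acceptable rule; nonmiss is a hypothetical helper variable, and ties are broken arbitrarily:

Code:

```stata
* Count non-missing measures per row (nonmiss is a hypothetical helper;
* strok lets string variables be counted as well)
egen nonmiss = rownonmiss(rass cam_f1 cam_f2_num cam_f3_num cam_f4_num CAMICUv3 cam_7 cam_aphasia), strok

* Within each subject-date group, sort by completeness and keep the
* last row, i.e. the one with the most non-missing measures
bysort record_id cam_date_num (nonmiss): keep if _n == _N
drop nonmiss

* Stops with an error if duplicates remain
isid record_id cam_date_num
```

This applies only when the run is made on the original data, before any duplicates drop has removed rows.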