  • Cleaning up duplicates in long data

    Hello,
    I have a data set with repeated measures in long format which I would like to reshape to wide form like this:
    reshape wide rass cam_f1 cam_f2_num cam_f3_num cam_f4_num CAMICUv3 cam_7 cam_aphasia, i(record_id) j(cam_date_num)

    Unfortunately, I don't yet have a unique identifier for my "j" within each subject. I have a date for each repeated measure, which I've been able to destring into [almost] a unique identifier for this purpose. However, in several cases the date is duplicated within a subject, and I would like to simply drop these duplicates. The problem with just using "duplicates drop cam_date_num" is that dates overlap across subjects: subject 1 may have had this assessment on the same date that subject 2 also had one, which for my purposes is not a true duplicate. Is there a way to identify and drop duplicates within a single subject?


    I'm using Stata/IC 14.2 through Microsoft Remote Desktop on a MacBook.

    Thank You,
    Tom Lawson


  • #2
    What about just
    Code:
    duplicates drop record_id cam_date_num, force
    where I am assuming record_id identifies the subject.
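
    Because force discards observations even when they differ on other variables, it may be worth looking at the duplicated subject-date pairs before dropping anything. A minimal sketch, using made-up toy data in place of the real file (only record_id, cam_date_num and rass come from the question; the values and the variable name dup are assumptions for illustration):

    ```stata
    * Toy data standing in for the real file (values are invented)
    clear
    input record_id cam_date_num rass
    1 20100 0
    1 20100 -1
    1 20101 0
    2 20100 1
    2 20102 0
    end

    * Count surplus observations per subject-date pair
    duplicates report record_id cam_date_num

    * Flag every observation that belongs to a duplicated subject-date pair
    duplicates tag record_id cam_date_num, generate(dup)

    * Inspect the flagged rows to see whether they differ on other variables
    sort record_id cam_date_num
    list record_id cam_date_num rass if dup > 0, sepby(record_id)

    * Keep one observation per pair (the first in the current sort order)
    duplicates drop record_id cam_date_num, force

    * Stops with an error unless the pair now uniquely identifies observations
    isid record_id cam_date_num
    ```

    In the toy data, subject 2's assessment on date 20100 is kept, since it only shares a date with subject 1 and not a subject-date pair. Once isid passes, cam_date_num should be usable as j() in reshape wide.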



    • #3
      Thank you, Hemanshu; I love that there was a simple way to make this happen. This worked fairly well, but it also dropped many observations that held values in other variables. For the purpose of setting up my wide data, though, I didn't need them anyway.
      Thanks for your help!

      Tom
