  • How to drop entire observations with duplicates for a particular variable, keeping the original (or the first) observation?

    I have data from a survey where some participants opened the survey two, three, or more times. How do I drop the second and later entries for each participant, regardless of whether any given entry was completed in one go? I want to keep only the first observation. Participants are identified by an ID number, stored in the variable id. For example:
    id surveyprogress%
    1 23
    1 12
    1 56
    2 13
    In this case, I would want to drop the second and third rows because they share the same ID number as row 1.

    Please let me know, thanks so much in advance!
    Last edited by anisha arya; 31 Jul 2024, 13:24.

  • #2
    Assuming the dataset is already sorted in the correct order so that the chronologically first observation is also the physically first one, it's easy:
    Code:
    sort id, stable
    by id: keep if _n == 1
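
For illustration, here is a minimal sketch that applies the two commands above to the four example rows from the question. The variable name progress is an assumption (the post labels the column surveyprogress%, which is not a valid Stata name), and the sketch assumes the rows are already in chronological order.

```stata
* Rebuild the four example rows from the question.
* "progress" is an assumed name for the survey-progress column.
clear
input id progress
1 23
1 12
1 56
2 13
end

* A stable sort preserves the existing order within each id,
* so the first-entered row for each id stays first.
sort id, stable

* Keep only the first row within each id group.
by id: keep if _n == 1

list, noobs
```

This leaves two rows: id 1 with progress 23 and id 2 with progress 13. If the data are not already in chronological order but include a start-time variable (hypothetically named starttime), something like `bysort id (starttime): keep if _n == 1` would pick the earliest entry instead.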
