  • Drop all duplicates and not keep the first

    I want to drop all duplicates without keeping any of the duplicated observations (not even the first). I have a patient list in which patients can be in three different groups, but I only want the patients who are in group 1 alone, that is, not those who are in both group 1 and group 2 or in both group 1 and group 3. I used the duplicates drop command with patient ID as the variable, but this keeps the first observation whenever there are duplicates.

  • #2
    Cevin:
    welcome to this forum.
    Could you please share with interested listers an example/excerpt of your dataset via -dataex-? Thanks.
    Kind regards,
    Carlo
    (Stata 19.0)


    • #3
      Use duplicates tag to create a variable that marks all observations that are duplicated, then drop all those observations for which that variable is positive.
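
      A minimal sketch of that approach, assuming the identifier is stored in a variable called patient_id (a hypothetical name, as is the tag variable dup; substitute your own):

      ```stata
      * dup holds the number of other observations sharing the same patient_id
      * (0 means the patient appears only once)
      duplicates tag patient_id, generate(dup)

      * drop every observation belonging to a duplicated patient, the first included
      drop if dup > 0

      * the tag variable is no longer needed
      drop dup
      ```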


      • #4
        I am as positive about duplicates as anyone else, but in this instance you need just

        Code:
        bysort foo bar bazz : keep if _N == 1
        where you keep observations with the same values of foo bar bazz if and only if that combination of values occurs just once.
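
        For the patient list in #1, this might look like the sketch below. The toy data and the variable names patient_id and group are invented for illustration, and the sketch assumes every patient appears at least once in group 1.

        ```stata
        clear
        * hypothetical data: one observation per patient-group membership
        input patient_id group
        1 1
        2 1
        2 2
        3 1
        3 3
        4 1
        end

        * _N is the number of observations within each patient_id
        bysort patient_id : keep if _N == 1

        * patients 1 and 4 remain; patients 2 and 3 are dropped entirely
        list
        ```

        If some patients could appear only in group 2 or only in group 3, a further keep if group == 1 would be needed to restrict the list to group 1.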


        The duplicates command arose because (a) people who knew enough about by: just kept repeating the same advice to use a few commands that almost always hinged on a sort order and by:; (b) people newer to Stata couldn't fairly be expected to think up those few commands for a rather common task.

        Naturally I agree with those asking for a (realistic) data example.


        • #5
          Edit: Removed comment, as I had earlier misunderstood the code in #4.
