
  • Removing duplicates

    Hi,

    I can't figure out what I'm doing wrong.

    I have 100+ duplicates (people filled out the survey multiple times) in my dataset. The variable that indicates the duplicate is called "ID" and is a string variable. I inspected all duplicates, and I want to remove either the first or second occurrence. I worked out the following code but I keep getting the error "too few quotes".

    Here is my code:

    sort ID
    quietly by ID: gen dup = _n if _N>1 // labels the occurrence

    local ID1 "0372552" "0392169" "0414180" "0415160" "0421180" "1266329" "1447648" "1450152" "1501119"

    gen to_drop=0
    foreach e of local ID1 {
    replace to_drop = 1 if EMPLID == " `e' " & dup == 2
    }

    I also tried: foreach e in ID1 - the code runs but doesn't correctly identify the strings. My to_drop variable only contains zeros.

    Any ideas?

    Thanks!

  • #2
    The route you've chosen obliges you to go further.

    Code:
    local ID1 `" "0372552" "0392169" "0414180" "0415160" "0421180" "1266329" "1447648" "1450152" "1501119" "'
    Stata Rule: The outermost quotation marks are taken to be string delimiters and are stripped on reading.

    That is fine with

    Code:
    local beasts "frog toad newt"
    whenever you want frog toad newt to be the contents of the macro and are happy that the double quotes disappear into the dust.

    That is not fine whenever you want double quotes to be preserved within strings.

    Yet the internal quotes in your case look redundant, so you can go the other way, as

    Code:
    local ID1 0372552 0392169 0414180 0415160 0421180 1266329 1447648 1450152 1501119
    looks good.
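
    To connect this with the loop in #1, here is a minimal sketch. It assumes the identifier is held in ID throughout (the loop in #1 tests EMPLID instead) and uses only three of the IDs for brevity. Two further details matter. The comparison string should carry no padding spaces: " `e' " expands to " 0372552 " with a leading and trailing blank, which never equals "0372552". And foreach e in ID1 loops over the literal word ID1, not the contents of the macro, which would explain why to_drop stayed at zero.

    ```stata
    * Sketch only: assumes ID is the string identifier used everywhere.
    * Within-ID order is whatever the sort leaves, so check which
    * occurrence is numbered 1 and which 2 before dropping anything.
    sort ID
    quietly by ID: gen dup = _n if _N > 1

    * Unquoted list: each ID is one word of the local macro.
    local ID1 0372552 0392169 0414180

    gen byte to_drop = 0
    foreach e of local ID1 {
        * No spaces inside the quotes, so the match is exact.
        replace to_drop = 1 if ID == "`e'" & dup == 2
    }

    * Remove the flagged second occurrences.
    drop if to_drop == 1
    ```

    The compound-quoted definition of ID1 given earlier should work with the same loop, since foreach ... of local treats each quoted ID as one element.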




    • #3
      I am not sure, but I wonder whether simple code like
      Code:
      duplicates drop ID, force
      will work.
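
      A hedged sketch of how that might be checked first, assuming ID alone defines a duplicate; the variable name ndup is made up for illustration. Note that duplicates drop keeps the first observation in each group as currently sorted, so it may not fit if the second occurrence is the one to keep for some IDs.

      ```stata
      * How many IDs occur once, twice, and so on?
      duplicates report ID

      * ndup (hypothetical name) = number of extra copies of each ID;
      * it is 0 for IDs that occur only once.
      duplicates tag ID, generate(ndup)
      list ID if ndup > 0, sepby(ID)

      * Keep only the first observation of each ID in the current sort order.
      duplicates drop ID, force
      ```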



      • #4
        Indeed. As its putative author, I forgot to ask why duplicates isn't a solution here.
