  • Renaming duplicate observations in a panel survey dataset

    Dear all,

    I have a large survey panel dataset on individuals with duplicate observations (in some cases, the same individual name appears two, three, or more times across several different areas). There is no individual ID number or ID name in the dataset. I would like to rename the duplicates. Is there a Stata command to rename duplicates in panel data?

    Thank you for your help.

    Best regards,

    Clarisse

  • #2
    -duplicates drop- drops observations that are completely redundant, i.e., identical on every variable. Beyond that, be cautious: different individuals can share the same name.
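
    A minimal sketch of how one might inspect repeated names before dropping anything. The toy data and the variable names ind_name, area, and dup_name are assumptions made for illustration, since the thread does not give the actual variable names:

    ```stata
    clear
    * made-up example data: one exact duplicate (Anna, North) and one repeated name
    input str10 ind_name str5 area
    "Anna" "North"
    "Anna" "North"
    "Anna" "South"
    "Ben"  "West"
    end

    * tabulate how many names occur once, twice, three times, ...
    duplicates report ind_name

    * dup_name = 0 for names that occur once, k for names with k additional copies
    duplicates tag ind_name, generate(dup_name)

    * look at the repeated names before deciding what to do with them
    sort ind_name area
    list if dup_name > 0, sepby(ind_name)

    * drop only observations that are identical on every variable
    duplicates drop
    ```

    In this sketch only the second (Anna, North) row is dropped; the (Anna, South) row stays, because it may well be a different person or a different survey record.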

    • #3
      your situation is not entirely clear to me (please follow the advice in the FAQ and use -dataex- to show a data example), but my guess is that the following would be at least as convenient; use, e.g.,
      Code:
      bys ind_name: gen count=_n
      which gives you a new variable (I called it "count", but you can call it whatever you want) holding a running count of the occurrences of each name. The combination of name (I called it ind_name, as you did not tell us the actual variable name) and count should give you a distinct ID for each observation. Note that if you have something else that would help distinguish individuals, e.g., a date, you should include it.
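
      As a rough sketch on made-up data, the by-prefix numbers the occurrences within every name group in a single pass, so the command does not have to be repeated per name. The variables area, n_same, and new_name are hypothetical, introduced only for illustration, and sorting by area within name is an assumption made so that the numbering is reproducible:

      ```stata
      clear
      * made-up example data; ind_name and area are assumed variable names
      input str10 ind_name str5 area
      "Anna"  "North"
      "Anna"  "South"
      "Ben"   "North"
      "Anna"  "East"
      "Ben"   "West"
      "Carla" "South"
      end

      * running count within each name, ordered by area
      bysort ind_name (area): gen count = _n

      * total number of occurrences of each name
      by ind_name: gen n_same = _N

      * leave unique names unchanged; append the count only where a name repeats
      gen new_name = ind_name
      replace new_name = ind_name + "_" + string(count) if n_same > 1

      list, sepby(ind_name)
      ```

      Whether two observations with the same name are the same person is a separate question that this does not answer; it only makes the labels distinct.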

      • #4
        Thank you Rich.
        This code works, but I have to repeat it for each duplicate. That is tedious because I have hundreds of duplicates. Is there a way to rename all duplicates at the same time?

        Thank you.

        Best,

        Clarisse

        • #5
          I'm confused and think that only a data example will help; see
          Code:
          help dataex
