Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Comparing duplicates to generate new variable

    Hi all,

    I have a table which consists of duplicates/ pairs in the column "CompanyYear".

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str50 EXEC_FULLNAME str32 CompanyYear float minyear
    "Eric A. Benhamou"       "3COM CORP2000" 1993
    "Michael E. Rescoe"      "3COM CORP2000" 2000
    "Michael E. Rescoe"      "3COM CORP2001" 2000
    "Bruce L. Claflin"       "3COM CORP2001" 2001
    "Mark Slaven"            "3COM CORP2002" 2002
    "Bruce L. Claflin"       "3COM CORP2002" 2001
    "Mark Slaven"            "3COM CORP2003" 2002
    "Bruce L. Claflin"       "3COM CORP2003" 2001
    "Donald M. Halsted, III" "3COM CORP2004" 2004
    "Bruce L. Claflin"       "3COM CORP2004" 2001
    end
    Everytime there is a duplicate/ pair in the column "CompanyYear", I want Stata to compare the "minyear" for both entries and make an entry in the newly generated variable "later".

    Example: Eric Benhamou and Michael Rescoe both have an entry with "3COM CORP2000" in the column "CompanyYear". Yet, the "minyear" for Eric Benhamou is 1993 while for Michael Rescoe it is 2000. Thus, I want Stata to mark Eric Benhamou with a "0" in the column "later" and Michael Rescoe with a "1". Hence, always the person which has the higher "minyear" in the comparison of two duplicates should get a "1" in the column "later", while the other person gets a "0".

    Thanks for your comments.

    Best
    Matthias

  • #2
    This seems to do what you seek.
    Code:
    . bysort CompanyYear (minyear): generate later = _n==_N
    
    . list, noobs sepby(CompanyYear)
    
      +----------------------------------------------------------+
      |          EXEC_FULLNAME     CompanyYear   minyear   later |
      |----------------------------------------------------------|
      |       Eric A. Benhamou   3COM CORP2000      1993       0 |
      |      Michael E. Rescoe   3COM CORP2000      2000       1 |
      |----------------------------------------------------------|
      |      Michael E. Rescoe   3COM CORP2001      2000       0 |
      |       Bruce L. Claflin   3COM CORP2001      2001       1 |
      |----------------------------------------------------------|
      |       Bruce L. Claflin   3COM CORP2002      2001       0 |
      |            Mark Slaven   3COM CORP2002      2002       1 |
      |----------------------------------------------------------|
      |       Bruce L. Claflin   3COM CORP2003      2001       0 |
      |            Mark Slaven   3COM CORP2003      2002       1 |
      |----------------------------------------------------------|
      |       Bruce L. Claflin   3COM CORP2004      2001       0 |
      | Donald M. Halsted, III   3COM CORP2004      2004       1 |
      +----------------------------------------------------------+

    Comment


    • #3
      Thank you William, your reply was as always very helpful. This command is exactly what I needed

      Comment

      Working...
      X