Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Deleting duplicate observations under certain conditions

    Hello again. I am working with trade agreement data, though some country-pairs have duplicate entries for a given year in the case that the trade agreement has been amended/updated. I would like to keep only the most recent trade agreement for each pair in a given year. Below is an example of what the data looks like:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str3(ex im) int(year entry_force_year) str12 name
    "DEU" "ITA" 1994 1994 "EC Treaty"   
    "DEU" "ITA" 1995 1994 "EC Treaty"   
    "DEU" "ITA" 1996 1994 "EC Treaty"   
    "DEU" "ITA" 1997 1994 "EC Treaty"   
    "DEU" "ITA" 1998 1994 "EC Treaty"   
    "DEU" "ITA" 1999 1994 "EC Treaty"   
    "DEU" "ITA" 1999 1999 "EU"          
    "DEU" "ITA" 2000 1994 "EC Treaty"   
    "DEU" "ITA" 2000 1999 "EU"          
    "DEU" "ITA" 2001 1994 "EC Treaty"   
    "DEU" "ITA" 2001 1999 "EU"          
    "DEU" "ITA" 2002 2002 "EU Expansion"
    "DEU" "ITA" 2002 1999 "EU"          
    "DEU" "ITA" 2002 1994 "EC Treaty"   
    end
    Where "ex" and "im" denote exporter and importer, respectively, entry_force_year denotes the first year the agreement goes into force, and name denotes the name of the agreement. As you can see, there are duplicates in ex, im, and year whenever a new trade agreement replaces an old one. Ideally, the data set would look like this:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str3(ex im) int(year entry_force_year) str12 name
    "DEU" "ITA" 1994 1994 "EC Treaty"   
    "DEU" "ITA" 1995 1994 "EC Treaty"   
    "DEU" "ITA" 1996 1994 "EC Treaty"   
    "DEU" "ITA" 1997 1994 "EC Treaty"   
    "DEU" "ITA" 1998 1994 "EC Treaty"   
    "DEU" "ITA" 1999 1999 "EU"          
    "DEU" "ITA" 2000 1999 "EU"         
    "DEU" "ITA" 2001 1999 "EU"          
    "DEU" "ITA" 2002 2002 "EU Expansion"
    end
    Essentially, the new trade agreement takes the place of the old one. I am having a hard time figuring out where to start, though I suspect the solution won't be too complex. Thank you in advance!

  • #2
    Code:
    by ex im name (year), sort: gen int inception = year[1]
    by ex im year (inception), sort: keep if _n == _N

    Comment


    • #3
      Originally posted by Clyde Schechter View Post
      Code:
      by ex im name (year), sort: gen int inception = year[1]
      by ex im year (inception), sort: keep if _n == _N
      This code is amazingly simple and useful, as always. Thank you!

      Comment

      Working...
      X