Hello again. I am working with trade agreement data, though some country-pairs have duplicate entries for a given year in the case that the trade agreement has been amended/updated. I would like to keep only the most recent trade agreement for each pair in a given year. Below is an example of what the data looks like:
Where "ex" and "im" denote exporter and importer, respectively, entry_force_year denotes the first year the agreement goes into force, and name denotes the name of the agreement. As you can see, there are duplicates in ex, im, and year whenever a new trade agreement replaces an old one. Ideally, the data set would look like this:
Essentially, the new trade agreement takes the place of the old one. I am having a hard time figuring out where to start, though I suspect the solution won't be too complex. Thank you in advance!
Code:
* Example generated by -dataex-. To install: ssc install dataex clear input str3(ex im) int(year entry_force_year) str12 name "DEU" "ITA" 1994 1994 "EC Treaty" "DEU" "ITA" 1995 1994 "EC Treaty" "DEU" "ITA" 1996 1994 "EC Treaty" "DEU" "ITA" 1997 1994 "EC Treaty" "DEU" "ITA" 1998 1994 "EC Treaty" "DEU" "ITA" 1999 1994 "EC Treaty" "DEU" "ITA" 1999 1999 "EU" "DEU" "ITA" 2000 1994 "EC Treaty" "DEU" "ITA" 2000 1999 "EU" "DEU" "ITA" 2001 1994 "EC Treaty" "DEU" "ITA" 2001 1999 "EU" "DEU" "ITA" 2002 2002 "EU Expansion" "DEU" "ITA" 2002 1999 "EU" "DEU" "ITA" 2002 1994 "EC Treaty" end
Code:
* Example generated by -dataex-. To install: ssc install dataex clear input str3(ex im) int(year entry_force_year) str12 name "DEU" "ITA" 1994 1994 "EC Treaty" "DEU" "ITA" 1995 1994 "EC Treaty" "DEU" "ITA" 1996 1994 "EC Treaty" "DEU" "ITA" 1997 1994 "EC Treaty" "DEU" "ITA" 1998 1994 "EC Treaty" "DEU" "ITA" 1999 1999 "EU" "DEU" "ITA" 2000 1999 "EU" "DEU" "ITA" 2001 1999 "EU" "DEU" "ITA" 2002 2002 "EU Expansion" end
Comment