Hello,
I currently have an issue with my data. I have a dataset (see below) which includes an Event ID, columns for the year, month and day, a Country variable, a variable counting the number of casualties (nkill) and some other columns (related EH EI EJ).
Some of the entries are "multiple Events" which are connected to some other entry. Hence, the related eventid is indicated in the related column, or if there is more than one related Event, further eventids are indicated in the columns EH, EI, EJ, .....
Now what I would like Stata to do is the following:
1. For each line in the dataset that is part of a multiple Event:
make sure that EACH of the eventids in the related and further columns can be found in the dataset (in the first column eventid) --> if not ALL related Events can be found, the line should be deleted
2. If ALL of the related Events can be found AND the date and Country are identical among ALL events, I would like Stata to sum up all of these events in one line, summing up the numbers for nkill for these related Events and additionally writing the Name of the City where nkill was highest. --> in the end I only want to Keep 1 line for a "multiple Event"
I would be very happy about any Kind of help. I already tried around with fndmtch, ... but couldn't solve my Problem.
Thanks a lot!
I currently have an issue with my data. I have a dataset (see below) which includes an Event ID, columns for the year, month and day, a Country variable, a variable counting the number of casualties (nkill) and some other columns (related EH EI EJ).
Some of the entries are "multiple Events" which are connected to some other entry. Hence, the related eventid is indicated in the related column, or if there is more than one related Event, further eventids are indicated in the columns EH, EI, EJ, .....
Now what I would like Stata to do is the following:
1. For each line in the dataset that is part of a multiple Event:
make sure that EACH of the eventids in the related and further columns can be found in the dataset (in the first column eventid) --> if not ALL related Events can be found, the line should be deleted
2. If ALL of the related Events can be found AND the date and Country are identical among ALL events, I would like Stata to sum up all of these events in one line, summing up the numbers for nkill for these related Events and additionally writing the Name of the City where nkill was highest. --> in the end I only want to Keep 1 line for a "multiple Event"
I would be very happy about any Kind of help. I already tried around with fndmtch, ... but couldn't solve my Problem.
Thanks a lot!
Code:
* Example generated by -dataex-. To install: ssc install dataex clear input double eventid int iyear byte(imonth iday) str32 country_txt double(nkill related EH EI EJ) 197000000001 1970 0 0 "Dominican Republic" 1 . . . . 197000000002 1970 0 0 "Mexico" 0 . . . . 197001000001 1970 1 0 "Philippines" 1 . . . . 197001000002 1970 1 0 "Greece" . . . . . 197001000003 1970 1 0 "Japan" . . . . . 197001010002 1970 1 1 "United States" 0 . . . . 197001020001 1970 1 2 "Uruguay" 0 . . . . 197001020002 1970 1 2 "United States" 0 . . . . 197001020003 1970 1 2 "United States" 0 . . . . 197001030001 1970 1 3 "United States" 0 . . . . 197001050001 1970 1 1 "United States" 0 . . . . 197001060001 1970 1 6 "United States" 0 . . . . 197001080001 1970 1 8 "Italy" 0 . . . . 197001090001 1970 1 9 "United States" 0 . . . . 197001090002 1970 1 9 "United States" 0 . . . . 197001100001 1970 1 10 "East Germany (GDR)" . . . . . 197001110001 1970 1 11 "Ethiopia" 1 . . . . 197001120001 1970 1 12 "United States" 0 . . . . 197001120002 1970 1 12 "United States" 0 . . . . 197001130001 1970 1 13 "United States" 0 . . . . 197001140001 1970 1 14 "United States" 0 . . . . 197001150001 1970 1 15 "Uruguay" 0 . . . . 197001190002 1970 1 19 "United States" 0 . . . . 197001190003 1970 1 19 "United States" 0 . . . . 197001190004 1970 1 19 "United States" 0 . . . . 197001200001 1970 1 20 "Guatemala" 1 . . . . 197001210001 1970 1 21 "Philippines" 0 . . . . 197001220001 1970 1 22 "Venezuela" 0 . . . . 197001220002 1970 1 22 "United States" 0 . . . . 197001250001 1970 1 25 "United States" 0 . . . . 197001250002 1970 1 25 "United States" 0 . . . . 197001260001 1970 1 26 "United States" 0 . . . . 197001260003 1970 1 26 "United States" 0 . . . . 197001270002 1970 1 27 "United States" 0 . . . . 197001280001 1970 1 28 "East Germany (GDR)" . . . . . 197001280002 1970 1 28 "United States" 0 . . . . 197001300001 1970 1 30 "United States" 0 197001300002 . . . 197001300002 1970 1 30 "United States" 0 197001300001 . . . 197001300003 1970 1 30 "United States" 0 . . . . 197001310001 1970 1 31 "Philippines" 0 . . . . 197001310002 1970 1 31 "United States" 0 . . . . 197002010001 1970 2 1 "United States" 0 . . . . 197002010002 1970 2 1 "United States" 0 . . . . 197002020001 1970 2 2 "United States" 0 . . . . 197002030001 1970 2 3 "United States" 0 197002030002 . . . 197002030002 1970 2 3 "United States" 0 197002030001 . . . 197002040001 1970 2 4 "United States" 0 . . . . 197002060001 1970 2 6 "United States" 0 . . . . 197002060002 1970 2 6 "United States" 0 . . . . 197002060003 1970 2 6 "United States" 0 . . . . 197002060004 1970 2 6 "United States" 0 197002060005 197002060006 . . 197002060005 1970 2 6 "United States" 0 197002060004 197002060006 . . 197002060006 1970 2 6 "United States" 0 197002060004 197002060005 . . 197002060007 1970 2 6 "United States" 0 . . . . 197002070003 1970 2 7 "United States" 0 . . . . 197002080001 1970 2 8 "United States" 0 197002080002 197002090003 . . 197002080002 1970 2 8 "United States" 0 197002080001 197002090003 . . 197002090003 1970 2 9 "United States" 0 197002080001 197002080002 . . 197002090004 1970 2 9 "United States" 0 . . . . 197002100001 1970 2 10 "West Germany (FRG)" 1 . . . . 197002110001 1970 2 11 "United States" 0 . . . . 197002130001 1970 2 13 "United States" 0 . . . . 197002130002 1970 2 13 "West Germany (FRG)" 7 . . . . 197002130003 1970 2 13 "United States" 0 . . . . 197002150001 1970 2 15 "United States" 0 . . . . 197002150002 1970 2 15 "United States" 0 . . . . 197002160001 1970 2 16 "United States" 1 197002160004 . . . 197002160002 1970 2 16 "United States" 0 197002160003 . . . 197002160003 1970 2 16 "United States" 0 197002160002 . . . 197002160004 1970 2 16 "United States" 0 197002160001 . . . 197002170001 1970 2 17 "United States" 0 . . . . 197002170002 1970 2 17 "United States" 0 . . . . 197002170003 1970 2 17 "United States" 0 . . . . 197002170004 1970 2 17 "United States" 0 . . . . 197002180002 1970 2 18 "United States" 0 . . . . 197002180003 1970 2 18 "United States" 0 . . . . 197002200001 1970 2 20 "United States" 0 . . . . 197002200002 1970 2 20 "United States" 0 . . . . 197002200003 1970 2 20 "United States" 0 . . . . 197002210001 1970 2 21 "West Germany (FRG)" 0 . . . . 197002210002 1970 2 21 "Switzerland" 47 . . . . 197002210003 1970 2 21 "United States" 0 . . . . 197002210004 1970 2 21 "United States" 0 . . . . 197002210005 1970 2 21 "United States" 0 . . . . 197002210006 1970 2 21 "United States" 0 . . . . 197002220001 1970 2 22 "United States" 0 . . . . 197002220002 1970 2 22 "United States" 0 . . . . 197002230001 1970 2 23 "United States" 0 197002230002 197002230003 197002230004 . 197002230002 1970 2 23 "United States" 0 197002230001 197002230003 197002230004 . 197002230003 1970 2 23 "United States" 0 197002230001 197002230002 197002230004 . 197002230004 1970 2 23 "United States" 0 197002230001 197002230002 197002230003 . 197002230005 1970 2 23 "United States" 0 . . . . 197002240002 1970 2 24 "United States" 0 . . . . 197002240003 1970 2 24 "United States" 0 . . . . 197002270001 1970 2 27 "United States" 0 . . . . 197002280001 1970 2 28 "Jordan" . . . . . 197003000001 1970 3 0 "Philippines" 0 . . . . 197003010001 1970 3 1 "Italy" 0 . . . . 197003010005 1970 3 1 "United States" 0 . . . . 197003010006 1970 3 1 "United States" 0 . . . . end
Comment