Merging monadic and dyadic dataset

Henry Gilliver

Join Date: Jun 2015

Posts: 9
#1

10 Jun 2015, 12:05

Hi all,

Apologies if this seems trivial, but I can assure you that I've spent two days solidly trying to sort this out independently, with no luck.

I've got a dataset on foreign aid flows in which each observation contains a recipient/donor pair, the year, and other information such as the type of aid. Each dyad appears only once per year within my temporal range, but each recipient and each donor individually appears in many observations per year. Below is a visual example.

Year  recipient (str)  recipientcode  donor (str)  donorcode  dyadid  value...
2004  Ghana            234            USA          12         1661    0.325...
2004  Benin            234            Belgium      18         1300    0.515...
2005  Ghana            234            USA          12         1661    0.250...
2005  Benin            234            Belgium      18         1300    0.850...
2006  Ghana            234            USA          12         1661    0.015...
2006  Benin            234            Belgium      18         1300    0.210...

So each observation is uniquely defined by the year, donor and recipient together, as individually (or as a pair) these variables are duplicated many times. I need to merge this dyadic data with a dataset containing monadic information about the recipient countries. These variables include a dummy for whether the recipient sat on the UN Security Council in a given year, quality of governance, number of natural disasters etc. This dataset contains one observation per recipient country per year, and does not contain the donor countries at all. e.g:

Year  Country  scmember  coruptioncontrol  politicalviolence...
2004  Ghana    0         1.3               1.8...
2004  Benin    1         0.3               1.4...
2005  Ghana    0         1.4               1.7...
2005  Benin    1         0.25              1.1...

Every time I try to merge, I'm told that the variables I've specified don't uniquely identify observations in the master data. What I really want Stata to do is import the monadic information about the recipient for each year, and insert the relevant information into every observation containing the correct recipient/year combination. Is this possible, and what steps do I need to take in order to obtain this result?

Any help will be much appreciated, as I've been stuck on this for a really long time. Thanks
Sergiy Radyakin

Join Date: Apr 2014

Posts: 1867
#2

10 Jun 2015, 12:15

Code:

rename recipient country
merge year country using "..second file..."

Every time I try to merge

where is your merge syntax?

Sergiy Radyakin
Henry Gilliver

Join Date: Jun 2015

Posts: 9
#3

10 Jun 2015, 12:31

Originally posted by Sergiy Radyakin

where is your merge syntax?

Sergiy Radyakin

I hadn't realised merge by itself was an option. I had been trying to use merge 1:m and merge m:1 depending on which way I was attempting to merge the datasets. All the literature I'd consulted online detailed 1:m/m:1 merges, as does the Stata 13 manual. I hadn't considered looking up old syntax. Thanks very much, I can't believe how simple that was compared to how long I've been agonising over not being able to make it work. I'm pretty comfortable using Stata for statistical analysis, but I've never used it to build my own dataset before, hence my unfamiliarity with the command.

thanks again, enormous help.
William Lisowski

Join Date: Dec 2014

Posts: 10150
#4

10 Jun 2015, 16:03

I believe that, when you were trying the current syntax, you wanted a merge m:1 with the dyadic data in memory and "using" the monadic data from disk (multiple dyadic observations match each monadic observation), of course first using the rename that Sergiy identified as necessary. Not sure why this didn't work for you (you said you'd tried it) since we don't see the code from your attempts.
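
For concreteness, a minimal sketch of that m:1 merge might look like the following when run from a do-file. The file names dyadic.dta and governance.dta are borrowed from post #5 later in this thread, and the sketch assumes the monadic file identifies recipients by a variable named country; both are assumptions to adjust to your own data.

```stata
* Sketch only: file and variable names are assumptions, not confirmed details
use dyadic.dta, clear                 // dyadic (master) data in memory
rename recipient country              // match the key name assumed in the monadic file
merge m:1 year country using governance.dta
tabulate _merge                       // inspect how many observations matched
```

With m:1, Stata checks that year and country uniquely identify observations in the using data and stops with an error if they do not, which is a useful safeguard.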
Henry Gilliver

Join Date: Jun 2015

Posts: 9
#5

11 Jun 2015, 04:58

Well, the old syntax sorted out my problem, but in the interest of understanding what was going wrong with the new syntax, as well as helping others in the future who may encounter similar problems:

Code:

use dyadic.dta
sort year recipient
merge m:1 year recipient using governance.dta

(note: variable year was int, now float to accommodate using data's values)
variables year recipient do not uniquely identify observations in the using data
r(459);

Nothing merged, and there's no _merge variable in the master. Both datasets are correctly sorted. Variables year and recipient have the same name in both datasets. The code is red in the Command window. As I say, the old syntax worked well.
William Lisowski

Join Date: Dec 2014

Posts: 10150
#6

11 Jun 2015, 06:40

The Stata error message

variables year recipient do not uniquely identify observations in the using data

contradicts your assertion that the monadic data

contains one observation per recipient country per year

See help duplicates for tools to assist in dealing with duplicated observations. I would be concerned about the results of your merge using the old syntax.
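
As a rough sketch of how one might track down the offending observations (assuming the monadic file is governance.dta with key variables year and recipient, as in post #5; the variable name dup is hypothetical):

```stata
* Sketch only: diagnose why year and recipient fail to identify observations
use governance.dta, clear
capture noisily isid year recipient            // reports an error if the keys are not unique
duplicates report year recipient               // counts of observations by number of copies
duplicates tag year recipient, generate(dup)   // dup = number of other obs sharing the same key
list year recipient if dup > 0                 // show the duplicated key combinations
```

Any observation listed at the end is one that blocks an m:1 merge on those keys.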
Henry Gilliver

Join Date: Jun 2015

Posts: 9
#7

11 Jun 2015, 09:49

Thanks very much. There were five observations tacked onto the end of the dataset with missing values recorded on every variable, which were reported as duplicates. When I dropped those observations and ran m:1 merge it was successful.
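
For anyone following along, the fix might look roughly like this in a do-file. This is a sketch under assumptions: it treats the stray rows as those where both key variables are missing, and governance_clean.dta is a hypothetical file name chosen so the original file is not overwritten.

```stata
* Sketch only: drop the empty rows, confirm uniqueness, then merge
use governance.dta, clear
count if missing(year) & missing(recipient)   // how many empty rows are there?
drop if missing(year) & missing(recipient)    // assumption: empty rows have both keys missing
isid year recipient                           // stops with an error if keys are still not unique
save governance_clean.dta, replace            // hypothetical name for the cleaned file

use dyadic.dta, clear
merge m:1 year recipient using governance_clean.dta
tabulate _merge                               // check matched vs. unmatched observations
```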