Merging more than two datasets in Stata

hadeel anan

Join Date: Apr 2016

Posts: 2
#1


12 Apr 2016, 11:03

Hello: I am a beginner in Stata and am currently working with NHANES data. My question is: can you combine more than two datasets in Stata? I tried the merge command and the "combine data" tab, but each seems to merge only two datasets at a time.

Any help would be appreciated.
Thanks
Hadeel
Girlan Oliveira

Join Date: Feb 2016

Posts: 99
#2

12 Apr 2016, 11:27

Hi Hadeel,

The 'merge' command should be used when the two datasets have the same variables. If that is the case for your three or more datasets, you can use merge more than once. First you combine datasets "A" and "B", generating a dataset "C". Then you combine "C" with the third dataset (D), generating a new dataset, and so on.
If your datasets do not have the same variables, you should use the 'append' command, following the same reasoning to combine your three or more datasets.

Kind regards,

Girlan Oliveira
Girlan Oliveira

Join Date: Feb 2016

Posts: 99
#3

12 Apr 2016, 11:47

Hadeel,

Let me make a small correction: 'append' is the command to use when the datasets have the same variables, not 'merge'.
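
To make the distinction concrete, here is a minimal sketch on toy data. Every name in it (id, height, weight, wave1, wave2, extra) is hypothetical and chosen only for illustration; it is not taken from NHANES.

```stata
* Toy data: all variable and file names are hypothetical.
clear
input id height
1 170
2 165
end
tempfile wave1
save `wave1'

clear
input id height
3 180
4 175
end
tempfile wave2
save `wave2'

clear
input id weight
1 70
2 60
end
tempfile extra
save `extra'

* append: same variables, different observations (rows are stacked)
use `wave1', clear
append using `wave2'
count

* merge: same observations, different variables (columns are added by key)
use `wave1', clear
merge 1:1 id using `extra'
describe
```

After -append- the data should hold the rows of both files under the same variables; after -merge- it should hold the original rows with weight and _merge added.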
hadeel anan

Join Date: Apr 2016

Posts: 2
#4

12 Apr 2016, 12:19

Hi Girlan:

Thank you for your response. Yes, I think merge is the command to use when the datasets have different variables, since you merge on a certain key variable. Still, I can't seem to merge more than two datasets.
In the NHANES tutorial the command is simply "merge varlist using filename [, options]", and this is supposed to merge multiple datasets. However, whenever I enter it I get an error message saying that this is old syntax.

Thanks
Hadeel
Clyde Schechter

Join Date: Apr 2014

Posts: 30117
#5

12 Apr 2016, 12:31

It is old syntax. Perhaps the NHANES tutorial goes back some years? I'm not familiar with it.

Anyway, before you do any merges you need to know what the merge key variable(s) is(are), and whether they uniquely identify observations in the data in memory, and also in the using data set. So if the key variables (varlist) uniquely identify the observations in both data sets it's

Code:

merge 1:1 varlist using filename [, options]

If, say, the varlist variables uniquely identify the observations in the data in memory, but not in the using data set, then it's:

Code:

merge 1:m varlist using filename [, options]

Similarly, if varlist uniquely identifies observations in the using data set, but not in the data in memory, it's:

Code:

merge m:1 varlist using filename [, options]

If the variables in varlist don't uniquely identify the observations in either data set, then you probably shouldn't be using -merge- at all. It is greater than 99.99999% likely, in that case, that there is an error in the data, or you are misunderstanding what you are trying to do with the data sets, or you should be using some other command. There is such a thing as -merge m:m- but it is almost never the correct thing to do.
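
One way to check which case applies before choosing 1:1, 1:m, or m:1 is sketched below with -isid- and -duplicates report-. The data and the names id, age, visit, sbp, and visits are hypothetical toy values, not from any real data set.

```stata
* Toy data: id repeats in the "visits" file and is unique in the data in memory.
clear
input id visit sbp
1 1 120
1 2 118
2 1 130
end
tempfile visits
save `visits'

clear
input id age
1 40
2 55
end

* isid stops with an error if id does not uniquely identify the observations
isid id

* Check the using file without halting the do-file when the check fails
preserve
use `visits', clear
capture isid id
display "id unique in visits? " cond(_rc == 0, "yes", "no")
duplicates report id
restore

* Key is unique in memory but not in the using file, so this is 1:m
merge 1:m id using `visits'
```

If -isid- fails in both files, that is the situation described above where -merge- is probably the wrong tool.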

Have a look at the manual section on -merge-. There are a lot of options that have been added since the 1:1/1:m/m:1 syntax was added, and they can be very useful--some of them might be helpful to you, too.
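
As for combining more than two data sets: -merge- takes one using file at a time here, so the usual approach is to chain the merges. The sketch below assumes three files that each have one observation per respondent, identified by a key I have called seqn; the files demo, bp, and lab and their contents are made-up toy data. The generate() option gives each merge its own result variable, because a second -merge- would otherwise fail on finding that _merge already exists.

```stata
* Toy data: three hypothetical files sharing the key seqn.
clear
input seqn age
1 40
2 55
3 61
end
tempfile demo
save `demo'

clear
input seqn sbp
1 120
2 130
3 125
end
tempfile bp
save `bp'

clear
input seqn chol
1 190
2 210
3 205
end
tempfile lab
save `lab'

* Chain the merges: the result of the first is the master for the second
use `demo', clear
merge 1:1 seqn using `bp', generate(m_bp)
merge 1:1 seqn using `lab', generate(m_lab)

* Inspect how each merge matched
tabulate m_bp m_lab
```

If you don't need the match results, the nogenerate option (or dropping _merge between steps) works as well.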
MARIA MORGAN

Join Date: Apr 2019

Posts: 1
#6

19 Apr 2019, 20:36

Hello, I am also trying to merge 3 datasets. I have successfully merged 2 of them, but I am having trouble merging the third. The only merge command that works is merge m:m, and using joinby yields no observations. Help!
Clyde Schechter

Join Date: Apr 2014

Posts: 30117
#7

19 Apr 2019, 21:11

Nobody can possibly help you without example data from the three data sets. Use the -dataex- command to do this. If you are running version 15.1 or a fully updated version 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

-merge m:m- just produces data salad. Don't use it. If it appears to be the only possibility for the -merge- it means either that your data sets are not -merge-able or you don't understand the structure of your data and are overlooking the right key for using -merge 1:m- or -merge m:1-.
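
A small sketch of why, using hypothetical toy data (the names id, x, y, a, and b are invented for illustration). As I understand the documented behavior, -merge m:m- pairs observations within each key group by their current sort order, whereas -joinby- forms every pairwise combination within the group.

```stata
* Toy data: id = 1 appears twice in one file and three times in the other.
clear
input id x
1 10
1 11
end
tempfile a
save `a'

clear
input id y
1 100
1 101
1 102
end
tempfile b
save `b'

* m:m pairs rows in order within id; the pairing depends on sort order
use `a', clear
merge m:m id using `b'
list, sepby(id)
count

* joinby forms all pairwise combinations within id
use `a', clear
joinby id using `b'
list, sepby(id)
count
```

The m:m result depends on the order the observations happened to be in, which is rarely a meaningful linkage.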

It is quite difficult for me to imagine how -joinby- could result in no observations. So in addition to showing example data, please show the code you tried.