  • Merging more than two datasets in Stata

    Hello: I am a beginner in Stata and am currently working with NHANES data. My question is: can you combine more than two data sets in Stata? I tried the merge command and the "combine data" tab, but each seems to merge only two data sets.

    Any help would be appreciated.
    Thanks
    Hadeel

  • #2
    Hi Hadeel,

    The 'merge' command should be used when the two databases have the same variables. If this is the case for your three or more databases, you can use merge more than once. First you combine databases "A" and "B", generating a database "C". Then you combine database "C" with the third database (D), generating a new database, and so on.
    If your databases do not have the same variables, you should use the 'append' command, following the same reasoning above to combine your three or more databases.

    kind regards

    Girlan Oliveira


    • #3
      Hadeel,

      Let me make a small correction: 'append' is the command to use when the databases have the same variables, not 'merge'.
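
      As a rough sketch of that case, assuming three data sets that share the same variables (the file names a, b, and c are hypothetical), more than two files can be stacked in one step because -append- accepts several file names after using:

      ```stata
      * a, b, and c are hypothetical file names for data sets with the same variables
      use a, clear
      append using b c      // stacks the observations of b and c below those of a
      ```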


      • #4
        Hi Girlan:

        Thank you for your response. Yes, I think merge is the command to use when the data sets have different variables, since you merge on a certain key "variable". Still, I can't seem to merge more than two datasets.
        In the NHANES tutorial the command is simply "merge varlist using filename [, options]", and this is supposed to merge multiple datasets. However, whenever I enter it I get an error message saying that this is old syntax.

        Thanks
        Hadeel


        • #5
          It is old syntax. Perhaps the NHANES tutorial goes back some years? I'm not familiar with it.

          Anyway, before you do any merges you need to know what the merge key variable(s) is(are), and whether they uniquely identify observations in the data in memory, and also in the using data set. So if the key variables (varlist) uniquely identify the observations in both data sets it's

          Code:
          merge 1:1 varlist using filename [, options]
          If, say, the varlist variables uniquely identify the observations in the data in memory, but not in the using data set, then it's:

          Code:
          merge 1:m varlist using filename [, options]
          Similarly, if varlist uniquely identifies observations in the using data set, but not the data in memory, it's

          Code:
          merge m:1 varlist using filename [, options]
          If the variables in varlist don't uniquely identify the observations in either data set, then you probably shouldn't be using -merge- at all. It is greater than 99.99999% likely, in that case, that there is an error in the data, or you are misunderstanding what you are trying to do with the data sets, or you should be using some other command. There is such a thing as -merge m:m- but it is almost never the correct thing to do.
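
          One way to check which of these cases applies before merging is sketched below. The key variable id and the file names master_file and other_file are hypothetical placeholders, not names from the data sets discussed here.

          ```stata
          * id, master_file, and other_file are hypothetical names
          use master_file, clear
          capture noisily isid id      // reports an error if id does not uniquely identify observations
          use other_file, clear
          duplicates report id         // tabulates how many observations share each value of id
          ```

          If -isid- passes in both files, 1:1 is the case that applies; if it fails in only one of them, that file is the "m" side.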

          Have a look at the manual section on -merge-. There are a lot of options that have been added since the 1:1/1:m/m:1 syntax was added, and they can be very useful--some of them might be helpful to you, too.
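
          For the original question of combining three data sets, the merges can simply be run one after another. This is only a sketch: it assumes the key variable is called seqn (the NHANES respondent sequence number) and that it uniquely identifies observations in every file, and the file names demo, bmx, and bpx are hypothetical.

          ```stata
          * demo, bmx, and bpx are hypothetical file names; seqn is assumed to be the key in each
          use demo, clear
          merge 1:1 seqn using bmx
          tabulate _merge              // check how many observations matched
          drop _merge                  // _merge must be dropped or renamed before the next merge
          merge 1:1 seqn using bpx
          tabulate _merge
          drop _merge
          ```

          The nogenerate option of -merge- is an alternative to dropping _merge by hand, at the cost of not being able to inspect the match results.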


          • #6
            Hello, I am also trying to merge three data sets. I have successfully merged two of them, but I am having trouble merging the third. The only merge command that works is merge m:m, and using joinby yields no observations. Help!


            • #7
              Nobody can possibly help you without example data from the three data sets. Use the -dataex- command to do this. If you are running version 15.1 or a fully updated version 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

              -merge m:m- just produces data salad. Don't use it. If it appears to be the only possibility for the -merge- it means either that your data sets are not -merge-able or you don't understand the structure of your data and are overlooking the right key for using -merge 1:m- or -merge m:1-.

              It is quite difficult for me to imagine how -joinby- could result in no observations. So in addition to showing example data, please show the code you tried.
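
              As a sketch of what is being asked for, something along these lines could be run once for each of the three data sets and the output pasted into a reply. The file name and variable names are hypothetical, and the range assumes the file has at least 20 observations.

              ```stata
              * dataset1, id, visit, and score are hypothetical names; repeat for each data set
              use dataset1, clear
              dataex id visit score in 1/20
              ```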
