  • Combine two data sets

    Hello,

    I am an undergraduate working on a survey. I would like to combine two data sets that have the same variable names. I have tried both merge and append, but the variables I created (each the sum of two existing variables) are not filled in for the new observations. Is there a way for me to do this without rewriting everything?

  • #2
    You are rather vague about what you did and what you want. It sounds like the two datasets contain separate cases (say, for example, you gave the same survey to two classes), in which case append is the way to go. But then you really confuse me with "(adding two of them together)" and such. If you can show us an example of what you have vs. what you want, it will be easier to help.


    • #3
      I second everything Ben says, and I would add: don't post your examples as attachments or screenshots. Use the -list- command to get representative data from both data sets, and then post those in a code block. (Click on the underlined A button to open the advanced editor, then on the # button. A pair of code-block delimiters will appear. Paste your examples between them.) This is the only way to ensure that the data you show us will be readable to the rest of us.
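
      As a rough sketch of what that could look like, assuming a placeholder file name (first_survey.dta is hypothetical and does not come from this thread):
      Code:
      * load one of the data sets (hypothetical file name)
      use first_survey.dta, clear
      * show the first five observations of every variable
      list in 1/5
      Repeating the same two commands for the second data set and pasting both listings would show how the files line up.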


      • #4
        Currently I am unable to do that, but what I mean is that in the first data set I generated a new variable, which was the row total of two different variables. I want the composite variable to include the observations from the added data set. I've played around and done some more research, and I don't think it is possible. Thank you anyway!


        • #5
          If I understand you right, you have a dataset, data1.dta, with the variables, for example, a1 and a2, and you added them, creating a1_2:
          Code:
          generate a1_2 = a1+a2
          Now you want to append a second dataset, data2.dta, which also has the variables a1 and a2, but not a1_2. If I am right, do this:
          Code:
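          * load the first dataset, which already contains a1_2
          use data1.dta
          * add the observations from data2.dta; a1_2 is missing for them
          append using data2.dta
          * fill in the sum only where it is still missing
          replace a1_2 = a1+a2 if missing(a1_2)
          save data3.dta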
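
          To see the effect without touching the real files, here is a minimal sketch that uses made-up values in place of data1.dta and data2.dta. The numbers and the tempfile names first and second are assumptions for illustration only. It should be run as one do-file, because tempfile names are local macros.
          Code:
          * stand-in for data1.dta (made-up values), with the sum already created
          clear
          input a1 a2
          1 2
          3 4
          end
          generate a1_2 = a1 + a2
          tempfile first
          save `first'

          * stand-in for data2.dta (made-up values), without a1_2
          clear
          input a1 a2
          5 6
          7 8
          end
          tempfile second
          save `second'

          * combine the two, then fill in the sum for the appended rows
          use `first', clear
          append using `second'
          list
          replace a1_2 = a1 + a2 if missing(a1_2)
          list
          The first list shows a1_2 missing for the two appended observations; the second shows it filled in. One caveat: if the original total was built with egen's rowtotal() function rather than +, the two behave differently when a1 or a2 is missing (+ returns missing, rowtotal() treats missing as 0). In that case it may be simpler to drop a1_2 after appending and recreate it with egen a1_2 = rowtotal(a1 a2).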
