  • Comparing two datasets by two variables

    Hi, I am very new to Stata. I have variables x1, y1, and z in data1.dta, and x2, y2, and N in data2.dta.

    I am trying to run an analysis where:
    1. Step 1: x1 and x2 are matched (merged) first.
    2. Step 2: Within the matched result, y1 and y2 are matched (merged).
    3. The expected result is the data where y1 and y2 have finally matched, so that I can see z and N where x1 = x2 and, within those, y1 = y2.
    The constraint is:
    The values of x1, y1, x2, and y2 aren't unique, which is why I can't merge or append the datasets.

    I was hoping to run this process in a loop.

    Thanks in advance.

  • #2
    I'm not sure I understand what you're trying to do with this process. But I think the result you are looking for, if I have grasped it, can be achieved by a simpler route:

    Code:
    use dataset_1 // THE ONE CONTAINING x1, y1, AND z
    rename (x1 y1) (x2 y2)
    joinby x2 y2 using dataset_2 // THE ONE WITH x2, y2, AND N
    rename (x2 y2) (x y)
    Note: You provided no example data, so this code is untested. Beware of typos or other errors.
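
    To illustrate what -joinby- does when the key variables are not unique, here is a minimal sketch on made-up data. The values and the tempfile name are invented for illustration and are not taken from the poster's files.

    ```stata
    * Toy stand-in for the second dataset (x2, y2, N); values are invented
    clear
    input x2 y2 N
    1 1 10
    1 1 20
    2 1 30
    end
    tempfile second
    save `second'

    * Toy stand-in for the first dataset (x1, y1, z); values are invented
    clear
    input x1 y1 z
    1 1 100
    1 1 200
    3 1 300
    end

    * Same steps as above: align the key names, then form all pairwise matches
    rename (x1 y1) (x2 y2)
    joinby x2 y2 using `second'
    rename (x2 y2) (x y)
    list, sepby(x y)
    ```

    With these toy values, the two observations with x = 1, y = 1 in each dataset pair up in all four combinations. Observations whose (x, y) combination appears in only one dataset are dropped, which is -joinby-'s default behavior.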



    • #3
      Hi Clyde, thanks for the response. I am trying to merge these two datasets, which have no unique identifier. dataset_1 has 230 observations and dataset_2 has 96 observations. I tried -joinby-, and the problem is that only 75 observations of dataset_2 get merged. My goal is to get one new dataset where all the information of dataset_1 corresponding to x2 and y2 of dataset_2 will be merged INCLUDING the ones which couldn't be exactly matched.
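
      The missing observations are consistent with -joinby-'s default, which discards observations that have no match in the other dataset. A minimal sketch, assuming the aim is simply to retain those unmatched observations alongside the matched ones, uses -joinby-'s unmatched() option. The file names follow the placeholders used in this thread, and the marker variable name matchstatus is hypothetical.

      ```stata
      * Assumes dataset_1.dta (x1, y1, z) and dataset_2.dta (x2, y2, N) exist
      use dataset_1, clear
      rename (x1 y1) (x2 y2)

      * unmatched(both) keeps unmatched observations from both datasets;
      * unmatched(using) would keep only the unmatched ones from dataset_2
      joinby x2 y2 using dataset_2, unmatched(both) _merge(matchstatus)

      * matchstatus: 1 = only in dataset_1, 2 = only in dataset_2, 3 = in both
      tabulate matchstatus
      rename (x2 y2) (x y)
      ```

      Unmatched observations from dataset_2 will have missing z, and unmatched observations from dataset_1 will have missing N. Whether this is the result actually wanted depends on what "couldn't be exactly matched" is meant to cover.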



      • #4
        My goal is to get one new dataset where all the information of dataset_1 corresponding to x2 and y2 of dataset_2 will be merged INCLUDING the ones which couldn't be exactly matched.
        I don't understand. The sentence contradicts itself. I think that trying to explain what you want in words is unlikely to succeed. Show us brief examples from both data sets, using the -dataex- command, and then show what you want the result to look like for those examples.

        If you are running version 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.
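
        As a rough sketch of what that could look like here, assuming the files are saved as dataset_1.dta and dataset_2.dta in the working directory (placeholder names from this thread), the following produces a postable example from each dataset. The range 1/10 is arbitrary; choose observations that include both matching and non-matching cases.

        ```stata
        * Example from the first dataset: key variables plus z
        use dataset_1, clear
        dataex x1 y1 z in 1/10

        * Example from the second dataset: key variables plus N
        use dataset_2, clear
        dataex x2 y2 N in 1/10
        ```

        Copy the output of each -dataex- command from the Results window and paste it into the post as is.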
