
  • Two datasets are not merging

    Hello,
    I am very new to Stata and am having a problem with merging. I have two datasets, each containing the variables CountryName and CountryCode. The first dataset also contains a variable with CO2 emissions, and the second contains a variable with GDP. I need to merge them and tried to do so by CountryName, by CountryCode, and by both, using one-to-one, many-to-one, and one-to-many merges, but none of these worked. The errors were "variable CountryName does not uniquely identify observations in the master data r(459)" and "variables CountryName CountryCode do not uniquely identify observations in the using data r(459)", but they should be unique. Could you please help me with this issue?
    Last edited by Maria Karimova; 20 Oct 2023, 06:25.

  • #2
    Let's call them "data 1" and "data 2".

    Open data 1 and run the following:

    Code:
    duplicates report CountryName CountryCode
    and post the results Stata returned to you.

    Open data 2 and run the same command again:

    Code:
    duplicates report CountryName CountryCode
    and post the results as well.

    With that we may be able to tell whether merge 1:1, m:1, or 1:m is suitable. The choice should not be made through trial and error. If the IDs in a data set are supposed to be unique, it's "1"; if they are repeated (e.g. each country has 10 rows, one per year from 2014 to 2023), it's "m". You need to know the data collection scheme and study design.
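
    As a rough sketch of how the report maps onto the merge type: the file names data1.dta and data2.dta are hypothetical, and the example assumes data 1 turns out to have several rows per country while data 2 has exactly one. Your data may well differ, so treat this as an illustration rather than the answer.

    ```stata
    * Sketch only: data1.dta and data2.dta are hypothetical file names.
    * Assumption: data2 has one row per CountryCode, data1 may have many.

    * Confirm that the key is unique in the "1" side of the merge.
    * isid stops with an error if CountryCode does not uniquely identify rows.
    use data2, clear
    isid CountryCode

    * Load the "m" side as master and merge many-to-one on the key.
    use data1, clear
    merge m:1 CountryCode using data2

    * Inspect how many rows matched and how many came from one file only.
    tabulate _merge
    ```

    If both files are unique on the key, the same pattern applies with merge 1:1 instead of m:1.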

    In addition, make sure you do not have missing values in your ID variables (in Stata they are stored as "." for numeric variables and as an empty string for string variables). Repeated missing values are considered repeated IDs, and they will prevent a successful merge.
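
    A minimal sketch of how one might look for both problems before merging. The variable name dup is made up for illustration, and whether flagged rows should be dropped depends on what they actually contain, so look at them first.

    ```stata
    * Sketch only: run this in each dataset before merging.

    * Count rows where either ID variable is missing.
    * missing() handles both numeric (.) and string ("") values.
    count if missing(CountryName) | missing(CountryCode)

    * Flag rows whose ID combination occurs more than once.
    * dup is a hypothetical variable name; it holds the number of surplus copies.
    duplicates tag CountryName CountryCode, generate(dup)

    * Look at the flagged rows before deciding what to do with them.
    list CountryName CountryCode if dup > 0

    * Assumption: rows with a missing ID carry no usable information.
    drop if missing(CountryName) | missing(CountryCode)
    ```

    Here missing IDs are removed only after the duplicates have been listed, so you can see whether the repeats come from blank rows or from genuinely repeated countries.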

    And last but not least, welcome to Statalist. Please take a moment to read the FAQ (link at the top of the page) to learn how to ask questions that are easy to answer. For example, provide sample data using the dataex command, and show the code you used.
    Last edited by Ken Chui; 20 Oct 2023, 06:49.



    • #3
      Assuming this is World Bank data and you have reshaped to a long layout, then you need to merge using both CountryName and year.

      Code:
      merge 1:1 CountryName year using ...
      Otherwise, provide a sample of the dataset, e.g., by copying and pasting the result of

      Code:
      dataex in 1/30



      • #4
        Ken, thanks a lot! I got rid of the duplicates and missing values, and the datasets merged successfully. Also, a big thank you for pointing out when to use 1:1, m:1, or 1:m. I really appreciate your help!
