Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Help merging with duplicate variables on master dataset

    Hi people!

    I read so many posts and watched videos but I cannot find a solutions to a problem I thought it would be simpler. I need to use merge so it could work as a vlooukup in excel. so But I keep getting the message "variable ISO_YEAR does not uniquely identify observations in the master data"

    My master dataset has duplicate data for some variables
    COUNTRY1 YEAR1 GOOD1
    COUNTRY1 YEAR1 GOOD2
    COUNTRY1 YEAR1 GOOD3
    COUNTRY1 YEAR2 GOOD1
    COUNTRY1 YEAR2 GOOD2
    COUNTRY1 YEAR2 GOOD3
    COUNTRY2 YEAR1 GOOD1
    COUNTRY2 YEAR1 GOOD2
    COUNTRY2 YEAR1 GOOD3
    COUNTRY2 YEAR2 GOOD1
    COUNTRY2 YEAR2 GOOD2
    COUNTRY2 YEAR2 GOOD3
    etc

    I need to add another variable "INCOME" from a different dataset. The income for each country for each year. The other dataset is like the following:
    COUNTRY1 YEAR1 INCOME_c1y1
    COUNTRY1 YEAR2 INCOME_c1y2
    ...
    COUNTRY2 YEAR1 INCOME_c2y1
    COUNTRY2 YEAR2 INCOME_c2y2
    ... ...

    obviously I was expecting the variable to be repeated in the master data at the end. I would need something like this in my master dataset

    COUNTRY1 YEAR1 GOOD1 INCOME_c1y1
    COUNTRY1 YEAR1 GOOD2 INCOME_c1y1
    COUNTRY1 YEAR1 GOOD3 INCOME_c1y1
    COUNTRY1 YEAR2 GOOD1 INCOME_c1y2
    COUNTRY1 YEAR2 GOOD2 INCOME_c1y2
    COUNTRY1 YEAR2 GOOD3 INCOME_c1y2
    COUNTRY2 YEAR1 GOOD1 INCOME_c2y1
    COUNTRY2 YEAR1 GOOD2 INCOME_c2y1
    COUNTRY2 YEAR1 GOOD3 INCOME_c2y1
    COUNTRY2 YEAR2 GOOD1 INCOME_c2y2
    COUNTRY2 YEAR2 GOOD2 INCOME_c2y2
    COUNTRY2 YEAR2 GOOD3 INCOME_c2y2

    I even concatenate the variables COUNTRY and YEAR on two datasets, so they would be ID
    c1_y1
    c1_y2
    c2_y1
    c2_y2
    ...

    But as there are duplicates on my master dataset, I cannot complete this.

    I cannot delete duplicates because I need the information of goods exported.

    Is there any possible way to achieve this?

    Thank you in advance!

    sh
    Last edited by silvana huanqui; 30 May 2022, 04:35.

  • #2
    did you try merge m:1?

    Comment


    • #3
      Yes, I've tried 1:1, m:1, 1:m. Everything, same error.

      Comment


      • #4
        Those are both dataset...


        Attached Files

        Comment


        • #5
          If merge tells you that ISO_YEAR does not uniquely identify values in your income dataset, then you need to use the duplicates command locate the duplicates - that you are unaware of - in your income dataset. Start by reviewing the output of
          Code:
          help duplicates
          and then consider the following
          Code:
          use income_dataseet
          duplicates report ISO_YEAR
          duplicates examples ISO_YEAR
          duplicates tag ISO_YEAR, generate(copies)
          sort ISO_YEAR
          browse if copies>0
          Last edited by William Lisowski; 30 May 2022, 09:37.

          Comment


          • #6
            Thanks to everyone, I had to use dmerge, its like the previous merge from a version several years ago.

            use Percapita_GDP.dta
            sort ISO_YEAR
            save Percapita_GDP.dta, replace
            use 1997_2021.dta
            dmerge ISO_YEAR using Percapita_GDP.dta

            But it worked...

            Comment

            Working...
            X