Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Merging dataset

    Hi,

    I want to ask a question on merging dataset. My Master dataset looks like an example given below. It has many countries with code year and country and some variables.


    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input byte Code int Year str12 Country str1(Var1 Var2)
    6 2000 "Afghanistan " "." "."
    6 2002 "Afghanistan " "." "."
    6 2005 "Afghanistan " "." "."
    6 2006 "Afghanistan " "." "."
    6 2007 "Afghanistan " "." "."
    6 2009 "Afghanistan " "." "."
    9 2001 "China"        "." "."
    9 2008 "China"        "." "."
    9 2012 "China"        "." "."
    9 2015 "China"        "." "."
    end


    In my second data file (which i want to merge) contains only Code and Country (identicial in both datasets) and some list of variables. The variable against the specific code and country will remain the same through out all the years in master data set. This is how my second data file looks like.


    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input byte Code str11 Country str1(Var3 Var4 Var5)
    6 "Afghanistan" "." "." "."
    9 "China "      "." "." "."
    end


    I shall be really thankful.












  • #2
    You need to remove trailing spaces using trim()

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input byte Code int Year str12 Country str1(Var1 Var2)
    6 2000 "Afghanistan " "." "."
    6 2002 "Afghanistan " "." "."
    6 2005 "Afghanistan " "." "."
    6 2006 "Afghanistan " "." "."
    6 2007 "Afghanistan " "." "."
    6 2009 "Afghanistan " "." "."
    9 2001 "China"        "." "."
    9 2008 "China"        "." "."
    9 2012 "China"        "." "."
    9 2015 "China"        "." "."
    end
    
    replace Country = trim(Country)
    tempfile temp
    save `temp'
    
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input byte Code str11 Country str1(Var3 Var4 Var5)
    6 "Afghanistan" "." "." "."
    9 "China "      "." "." "."
    end
    
    replace Country = trim(Country)
    merge 1:m Country using `temp'
    
    . merge 1:m Country using `temp'
    (note: variable Country was str11, now str12 to accommodate using data's values)
    
        Result                           # of obs.
        -----------------------------------------
        not matched                             0
        matched                                10  (_merge==3)
        -----------------------------------------

    Comment


    • #3
      Hi Justin,

      Many thanks for your suggestion. However it it still giving this error.

      variable Country does not uniquely identify observations in the master data
      r(459);

      Comment


      • #4
        You'll need to give me more to work with -- the code works with what you've provided as shown above. Either that or see the helpfile for joinby. In your second data example you have two observations, one for each country. So I assumed you have a unique identifier in your country variable, which is why I suggested a 1:m merge (alternatively m:1 depending on which dataset is loaded first). Your error code leads me to believe that's not the case -- you have multiple entries for Country in each dataset judging by your error. Assuming there are no duplicates:

        Code:
        help joinby

        Comment


        • #5
          Yeah I have checked for duplicates. It says there are none. Many thanks for your help.

          Comment

          Working...
          X