
  • merging data

    Hi everyone,
    I have a problem merging some dataset files.
    When I run the command
    Code:
    . merge m:1 code using "E:\data\DATA\all_data\regional_data.dta"
    I get the following error message:
    Code:
    variable code does not uniquely identify observations in the using data
    When I check my data I see no duplication. I also ran the describe command,
    and the results for the master file are as follows:
    Code:
                  storage   display    value
    variable name   type    format     label      variable label
    --------------------------------------------------------------------------------------------------------------------
    code            double  %12.0g                code
    year            double  %12.0g                year
    For the using file (the one being merged in), the results are as follows:
    Code:
    --------------------------------------------------------------------------------------------------------------------
                  storage   display    value
    variable name   type    format     label      variable label
    --------------------------------------------------------------------------------------------------------------------
    code            double  %12.0g                code
    Could someone please help me solve this issue?

  • #2
    Hi,
    The "duplication" problem refers to the values within the variable "code".
    If you open your using data and do the following, you will see if there are indeed duplicates:
    Code:
    use "E:\data\DATA\all_data\regional_data.dta", clear
    bysort code: generate N = _N
    summarize N
    If there are no duplicates in this dataset, N should equal 1 for every observation. If that is not the case, you have to review your data and find out why there are multiple observations with the same value of "code" in that file.
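    As a complement to the check above, here is a minimal sketch that uses Stata's built-in isid and duplicates commands to show which values of "code" repeat. The variable name dup is only an illustrative choice, and the sketch assumes the using file is the one named in the original post.

    ```stata
    use "E:\data\DATA\all_data\regional_data.dta", clear

    * isid reports an error if code does not uniquely identify observations;
    * capture noisily shows the message without stopping a do-file
    capture noisily isid code

    * Tabulate how many observations share each value of code
    duplicates report code

    * dup (illustrative name) is 0 for unique codes and positive for repeated ones
    duplicates tag code, generate(dup)

    * List the repeated codes, grouped by value
    sort code
    list code if dup > 0, sepby(code)
    ```

    Listing the flagged rows usually makes it easier to tell whether the repeats are accidental copies or whether another variable is needed to identify observations.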
    HTH



    • #3
      Originally posted by FernandoRios
      Thank you for replying.
      I did what you suggested and you are right: the maximum of N is 2, so some values of "code" appear twice in the using file.
      Thanks.
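      One possible next step, sketched under the assumption that the repeated rows are exact copies on every variable (which should be confirmed by inspecting them first). The file names "regional_unique.dta" and "master.dta" are hypothetical placeholders, not names from this thread.

      ```stata
      use "E:\data\DATA\all_data\regional_data.dta", clear

      * Drop observations that are identical on all variables
      * (assumption: the repeats are exact copies)
      duplicates drop

      * Stops with an error if code still does not uniquely identify observations
      isid code

      * Save under a new, hypothetical name so the original file is untouched
      save "regional_unique.dta", replace

      * "master.dta" is a placeholder for the master file
      use "master.dta", clear
      merge m:1 code using "regional_unique.dta"
      ```

      If isid still fails after duplicates drop, the repeated codes differ on some other variable, and the rows need to be reviewed rather than dropped.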
      Last edited by ALKEBSEE RADWAN; 06 Oct 2019, 06:36.
