  • Why non missing conflict?

    Hello,
    I'm saving a dataset in CSV format, then merging it back in.
    It generates a "nonmissing conflict" (I would have expected "not updated" instead).
    Can anyone tell me why?
    I have isolated both the variable and the observation.
    It's important for a project I'm working on.

    Code:
    // save CSV
    webuse nlsw88, clear
    keep if idcode == 4555
    keep wage idcode
    export delimited "master.csv", quote replace
    // convert csv 2 dta
    clear all
    import delimited "master.csv"
    save "master.dta", replace
    // merge
    webuse nlsw88, clear
    keep if idcode == 4555
    keep wage idcode
    merge 1:1 idcode using "master.dta", update replace

  • #2
    Because you included the "replace" option, which, according to the help file, "replace[s] all values of same-named variables in master with nonmissing values from using".

    • #3
      And including that option should result in "not updated", not "nonmissing conflict", because the value didn't change.

      Consider the following example:
      Code:
      // save CSV
      webuse nlsw88, clear
      keep if idcode == 4555 | idcode==1
      keep wage idcode
      export delimited "master.csv", quote replace
      
      // convert csv 2 dta
      clear all
      import delimited "master.csv"
      save "master.dta", replace
      
      // merge
      
      webuse nlsw88, clear
      keep if idcode == 4555 | idcode==1
      keep wage idcode
      merge 1:1 idcode using "master.dta", update replace

      Why does this code give me 1 "not updated" and 1 "nonmissing conflict"? Why is idcode==1 also a nonmissing conflict?

      • #4
        Because wage is a non-integer number, it is (generally) not possible to represent the fractional part exactly as text: 0.1, say, is a never-ending fraction in binary/octal/hexadecimal. So when you read the CSV data back in, the values may differ subtly from the original data. Consider the following modified version of your example, which keeps the imported value as wage2 so the two can be compared side by side.
        Code:
        // save CSV
        webuse nlsw88, clear
        keep if idcode == 4555 | idcode==1
        keep wage idcode
        export delimited "master.csv", quote replace
        
        // convert csv 2 dta
        clear all
        import delimited "master.csv"
        rename wage wage2
        save "master.dta", replace
        
        // merge
        
        webuse nlsw88, clear
        keep if idcode == 4555 | idcode==1
        keep wage idcode
        merge 1:1 idcode using "master.dta"
        
        generate diff = wage-wage2
        list, clean
        Code:
        . list, clean
        
               idcode       wage      wage2        _merge       diff  
          1.        1   11.73913   11.73913   Matched (3)          0  
          2.     4555   14.13042   14.13042   Matched (3)   9.54e-07
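
        To see where the two copies differ, one option is to display the stored values in hexadecimal and count the mismatches. This is only a sketch built on the example above: the tempfile name original and the 1e-6 tolerance are my own choices, not anything Stata requires. A difference of 9.54e-07 is about 2^-20, which is the spacing between adjacent float values between 8 and 16, so it would be consistent with wage2 landing one representable value away from wage.
        Code:
        // rebuild the comparison: wage from nlsw88, wage2 from the CSV round trip
        webuse nlsw88, clear
        keep if idcode == 4555 | idcode == 1
        keep wage idcode
        tempfile original            // hypothetical name, holds the untouched copy
        save `original'

        export delimited "master.csv", quote replace
        import delimited "master.csv", clear
        rename wage wage2
        merge 1:1 idcode using `original', nogenerate

        // show the exact stored values in hexadecimal
        format wage wage2 %21x
        list idcode wage wage2, clean

        // observations that differ exactly
        count if wage != wage2
        // observations that differ by more than a small relative tolerance (assumed 1e-6)
        count if reldif(wage, wage2) > 1e-6

        If the CSV step is not essential, saving the subset with save and merging that .dta file keeps the stored values bit for bit, so the merge should then report "not updated" rather than "nonmissing conflict".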
