Merge failure despite having unique and matching identifiers

Rin Kilde

Join Date: May 2023

Posts: 2
#1

07 May 2023, 20:57

Hello!

I am attempting to merge two data sets by county. One data set is on libraries, the other on unemployment. I am doing a 1:1 merge since the county names are the unique identifiers. Even though the county names are written the same, the merge is going wrong. I am not getting an error message, but only one observation, Washington county, is matching, and the rest aren't.

The result shows all the library data from the master data set with missing values for the unemployment variables, then the one observation that actually matched (Washington county, with both library and unemployment variables populated), then all the data from the using data set with missing values for the library variables. In other words, I get an observation for every county's library data, Washington county with both library and unemployment, then every county AGAIN, this time with just unemployment data.

I don't know why it is doing this. The county variable is a string, which I am less familiar with. Could that be part of the problem? The case and spelling are identical, so I don't understand why the merge is failing in this way.

Here's what I've got, copied from Stata:

. clear

. use "C:\Users\persi\Downloads\MetodsTESTKEEPJUSTAL .dta "

. merge 1:1 cnty using "C:\Users\persi\Downloads\MetodsTESTKEEPJUSTALunem p.dta"
(variable cnty was str20, now str27 to accommodate using data's values)

    Result                      Number of obs
    -----------------------------------------
    Not matched                           132
        from master                        66  (_merge==1)
        from using                         66  (_merge==2)

    Matched                                 1  (_merge==3)

Any ideas as to what this might be about?
Tags: 1:1 merge, merge, merging
Joseph Coveney

Join Date: Apr 2014

Posts: 4410
#2

07 May 2023, 21:40

Originally posted by Rin Kilde

Any ideas as to what this might be about?

I think that your first clue is the note that you got from merge: "variable cnty was str20, now str27 to accommodate using data's values".

The text data might look similar to you, but they obviously don't to Stata. Try something along the following lines.

Code:

use "C:\Users\persi\Downloads\MetodsTESTKEEPJUSTALunem p.dta"
generate str county = strlower(ustrtrim(cnty))
tempfile unemp
quietly save `unemp'

use "C:\Users\persi\Downloads\MetodsTESTKEEPJUSTAL .dta "
generate str county = strlower(ustrtrim(cnty))
merge 1:1 county using `unemp'
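
To see why keys that look identical fail to match, it can help to compare each value's length before and after trimming. The following is only a sketch under assumptions: the toy values and the helper names len_raw, len_trim and shown are hypothetical, and the actual cause in the poster's files (trailing blanks, leading blanks, or some other invisible character) is not confirmed in the thread.

```stata
* Toy data standing in for one of the two files (values are hypothetical)
clear
set obs 3
generate str27 cnty = "Washington"
replace cnty = "Mobile  " in 2      // two trailing blanks
replace cnty = " Baldwin" in 3      // one leading blank

* Compare each value's length before and after trimming
generate len_raw  = ustrlen(cnty)
generate len_trim = ustrlen(ustrtrim(cnty))

* Delimiters make otherwise invisible blanks visible in the listing
generate shown = "|" + cnty + "|"
list shown len_raw len_trim if len_raw != len_trim, noobs

* Build the cleaned key the same way as in the code above,
* then confirm it still uniquely identifies the observations
generate str county = strlower(ustrtrim(cnty))
isid county
```

On real data, the same generate and list lines can be run on each file in turn with cnty as the key. After the merge, something like list county _merge if _merge != 3 shows which keys still fail to match.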
Rin Kilde

Join Date: May 2023

Posts: 2
#3

08 May 2023, 01:13

Originally posted by Joseph Coveney

I think that your first clue is the note that you got from merge: "variable cnty was str20, now str27 to accommodate using data's values".

Thank you so much!! This worked like a charm!