Hello!
I am attempting to merge two data sets by county. One data set is on libraries, the other on unemployment. I am doing a 1:1 merge since the county names are the unique identifiers. Even though the county names appear to be written the same way in both files, the merge is going wrong. I am not getting an error message, but only one observation, Washington county, is matching, and the rest aren't. The result contains all the library data from the master data set with missing values for the unemployment variables, then the one observation that actually matched (Washington county, with both library and unemployment variables populated), then all the data from the using data set with missing values for the library variables. In other words: one observation per county with just library data, Washington county with both library and unemployment data, then every county AGAIN, this time with just unemployment data. I don't know why it is doing this. The county variable is a string, which I am less familiar with. Could that be part of the problem? As far as I can tell the case and spelling are identical, so I don't understand why the merge is failing in this way.
Here's what I've got, copied from Stata:
. clear
. use "C:\Users\persi\Downloads\MetodsTESTKEEPJUSTAL .dta "
. merge 1:1 cnty using "C:\Users\persi\Downloads\MetodsTESTKEEPJUSTALunem p.dta"
(variable cnty was str20, now str27 to accommodate using data's values)
    Result                      Number of obs
    -----------------------------------------
    Not matched                           132
        from master                        66  (_merge==1)
        from using                         66  (_merge==2)
    Matched                                 1  (_merge==3)
Any ideas as to what this might be about?
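One thing that may be worth checking, given the note that cnty went from str20 to str27: the county strings in the using file may carry extra characters that are not visible when browsing, such as leading or trailing blanks or a suffix. The following is only a diagnostic sketch under that assumption, not a confirmed fix. The file names "libraries.dta" and "unemployment.dta" are hypothetical placeholders for the two files above, and len is a hypothetical helper variable.

```stata
* Sketch only: "unemployment.dta" and "libraries.dta" are placeholder names
use "unemployment.dta", clear

* Show each county name with its length; a name that is longer than it
* looks on screen contains blanks or other hidden characters
generate len = strlen(cnty)
list cnty len in 1/10

* Remove leading/trailing blanks and collapse repeated internal blanks
replace cnty = strtrim(stritrim(cnty))
drop len
tempfile unemp_clean
save `unemp_clean'

* Apply the same cleaning to the master file, then retry the merge
use "libraries.dta", clear
replace cnty = strtrim(stritrim(cnty))
merge 1:1 cnty using `unemp_clean'
```

If the lengths listed differ between the two files for the same county, that would point to a string mismatch rather than a problem with merge itself. If the extra characters turn out to be a suffix rather than blanks, trimming alone will not be enough.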