
  • Merge failure despite having unique and matching identifiers

    Hello!

    I am attempting to merge two data sets by county. One data set is on libraries, the other on unemployment. I am doing a 1:1 merge since the county names are the unique identifiers. Even though the county names are written the same, the merge is going wrong. I am not getting an error message, but only one observation, Washington county, is merging, and the rest aren't. The result shows all the library data from the master data set with missing values for the unemployment variables, then the one observation that actually worked (Washington county with both library and unemployment variables populated), then all the data from the using data set with missing values for the library variables. In other words, there are observations for every county's library data, Washington county with both library and unemployment data, then every county AGAIN, this time with just unemployment data. I don't know why it is doing this. The county variable is a string, which I am less familiar with; could that be part of the problem? The case and spelling are identical, so I don't understand why the merge is failing in this way.

    Here's what I've got, copied from Stata:

    . clear

    . use "C:\Users\persi\Downloads\MetodsTESTKEEPJUSTAL .dta "

    . merge 1:1 cnty using "C:\Users\persi\Downloads\MetodsTESTKEEPJUSTALunem p.dta"
    (variable cnty was str20, now str27 to accommodate using data's values)

    Result                      Number of obs
    -----------------------------------------
    Not matched                           132
        from master                        66  (_merge==1)
        from using                         66  (_merge==2)

    Matched                                 1  (_merge==3)


    Any ideas as to what this might be about?

  • #2
    Originally posted by Rin Kilde
    Any ideas as to what this might be about?
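    I think that your first clue is the note that you got from merge: "variable cnty was str20, now str27 to accommodate using data's values". The using data store cnty in a wider string type than the master data, which suggests that at least some county names there contain extra characters, such as leading or trailing blanks, that are easy to miss in the Data Browser.

    The text data might look the same to you, but they evidently don't to Stata. Try something along the following lines, which trims surrounding whitespace and lowercases the key in both files before merging.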
    Code:
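    * Clean the key in the using (unemployment) data and save a temporary copy
    use "C:\Users\persi\Downloads\MetodsTESTKEEPJUSTALunem p.dta"
    generate str county = strlower(ustrtrim(cnty))
    tempfile unemp
    quietly save `unemp'

    * Clean the key the same way in the master (library) data, then merge on it
    use "C:\Users\persi\Downloads\MetodsTESTKEEPJUSTAL .dta "
    generate str county = strlower(ustrtrim(cnty))
    merge 1:1 county using `unemp'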
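
    If you want to see the problem before fixing it, here is a small self-contained sketch of the same idea. The county names, the values, and the variable names libraries, len_raw and len_trim are all made up for illustration; they are not from your files. It plants stray blanks in two names, compares string lengths before and after trimming to find them, and then merges on the cleaned key. Run it in one go from a do-file, since tempfiles and local macros do not survive across separate interactive runs.

```stata
* Toy data with made-up county names and values (not the poster's files)
clear
input str20 cnty unemp
"Washington" 4.1
"Autauga"    3.9
"Baldwin"    5.2
end

* Simulate stray whitespace: trailing blanks on one name, a leading blank on another
replace cnty = cnty + "   " in 2
replace cnty = " " + cnty in 3

* Diagnose: names whose length changes when trimmed carry hidden blanks
generate byte len_raw  = ustrlen(cnty)
generate byte len_trim = ustrlen(ustrtrim(cnty))
list cnty len_raw len_trim if len_raw != len_trim

* Build a cleaned key and save the using data to a temporary file
generate str county = strlower(ustrtrim(cnty))
tempfile unemp_clean
quietly save `unemp_clean'

* Master data with clean names; build the key the same way
clear
input str20 cnty libraries
"Washington" 3
"Autauga"    2
"Baldwin"    5
end
generate str county = strlower(ustrtrim(cnty))

* Merge on the cleaned key and confirm that every observation matched
merge 1:1 county using `unemp_clean'
assert _merge == 3
```

    If the length comparison turns up nothing in your real data, the difference may be something that trimming does not remove, such as doubled internal blanks or a "County" suffix in one file only, so it is worth listing a few unmatched names from each side and comparing them directly.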



    • #3
      Originally posted by Joseph Coveney
      I think that your first clue is the note that you got from merge: "variable cnty was str20, now str27 to accommodate using data's values".
      Thank you so much!! This worked like a charm!
