Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Not merging any observations

    Hi All,

    I usually can solve my questions with existing threats from the forum, however I seem to be stuck now.


    I want to merge the Masterfile including ZIP codes with another one with single ZIP codes and their latitude and longitude.

    - merge m:1 ZIP using H:\usingfile.dta

    It never matches a single observation, also not with 1:1, m:m or 1:m.

    I can't spot the problem and can't find any similar problems in the threats, any idea?


    I have ZIP from the Master file as:

    storage display value
    variable name type format label variable label
    --------------------------------------------------------------------------------
    ZIP str10 %10s


    and the file I want to use with:

    storage display value
    variable name type format label variable label
    --------------------------------------------------------------------------------
    ZIP str7 %9s PCD


  • #2
    if the zip's in the master file are really 10 characters long, they will never match anything in the using file as that has, apparently, a maximum of 7 characters; possibly, the using file zip is a subset of the master file zip? better answers might appear if you show short example data from each (please use -dataex- as explained in the FAQ for this forum)

    Comment


    • #3
      Thanks for the reply Rich, as I was writing this I found the mistake.

      There was somehow a space as first character in the Masterfile; as shown below.

      Thanks!

      Masterfile:
      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input str10 postcode
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      " W2 3AN"
      end

      Usingfile:
      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input str7 postcode
      "AB1 7FA"
      "AB1 7FG"
      "AB1 7FH"
      "AB1 7FJ"
      "AB1 7FS"
      "AB1 7FT"
      "AB1 7GA"
      "AB1 7GB"
      "AB1 7GD"
      "AB1 7GH"
      "AB1 7GZ"
      "AB1 7HA"
      "AB1 7HB"
      "AB1 7HD"
      "AB1 7HE"
      "AB1 7HF"
      "AB1 7HG"
      "AB1 7HH"
      "AB1 7HJ"
      "AB1 7HL"
      "AB1 7HN"
      "AB1 7HP"
      "AB1 7HQ"
      "AB1 7HR"
      "AB1 7HS"
      "AB1 7HT"
      "AB1 7HU"
      "AB1 7HW"
      "AB1 7HX"
      "AB1 7HY"
      "AB1 7HZ"
      "AB1 7JA"
      "AB1 7JB"
      "AB1 7JD"
      "AB1 7JE"
      "AB1 7JF"
      "AB1 7JG"
      "AB1 7JH"
      "AB1 7JJ"
      "AB1 7JL"
      "AB1 7JN"
      "AB1 7JP"
      "AB1 7JQ"
      "AB1 7JR"
      "AB1 7JS"
      "AB1 7JT"
      "AB1 7JU"
      "AB1 7JW"
      "AB1 7JX"
      "AB1 7JY"
      "AB1 7JZ"
      "AB1 7LA"
      "AB1 7LB"
      "AB1 7LD"
      "AB1 7LE"
      "AB1 7LF"
      "AB1 7LG"
      "AB1 7LH"
      "AB1 7LJ"
      "AB1 7LL"
      "AB1 7LN"
      "AB1 7LP"
      "AB1 7LQ"
      "AB1 7LR"
      "AB1 7LS"
      "AB1 7LT"
      "AB1 7LU"
      "AB1 7LW"
      "AB1 7LX"
      "AB1 7LY"
      "AB1 7LZ"
      "AB1 7NA"
      "AB1 7NB"
      "AB1 7ND"
      "AB1 7NE"
      "AB1 7NF"
      "AB1 7NG"
      "AB1 7NH"
      "AB1 7NJ"
      "AB1 7NL"
      "AB1 7NN"
      "AB1 7NP"
      "AB1 7NQ"
      "AB1 7NR"
      "AB1 7NS"
      "AB1 7NT"
      "AB1 7NU"
      "AB1 7NW"
      "AB1 7NX"
      "AB1 7NY"
      "AB1 7NZ"
      "AB1 7PA"
      "AB1 7PB"
      "AB1 7PD"
      "AB1 7PE"
      "AB1 7PF"
      "AB1 7PG"
      "AB1 7PH"
      "AB1 7PJ"
      "AB1 7PL"
      end
      Thanks!

      Comment


      • #4
        This is where it can help to become familiar with Stata's string functions, which manipulate strings. Parallel functions exist in SAS, Excel, R, probably any program you can name. In your current case, you can use -strtrim- to trim all leading and trailing blanks:

        Code:
        gen postcode_trim = strtrim(postcode)
        It won't trim internal blanks. Those look like Canadian postcodes, and I don't know the etiquette for storage (e.g. ICD-9 and -10 codes designating diseases have decimal points, but the etiquette is to always remove them for computer storage, and no leading zeroes or blanks either). If you need to trim multiple internal blanks to one blank (e.g. your software was inconsistent about how many blanks to insert), you could use -stritrim-. There are a bunch more interesting functions to be aware of.
        Be aware that it can be very hard to answer a question without sample data. You can use the dataex command for this. Type help dataex at the command line.

        When presenting code or results, please use the code delimiters format them. Use the # button on the formatting toolbar, between the " (double quote) and <> buttons.

        Comment


        • #5
          These are obviously not zip codes. I question whether they're Canadian postal codes either (some are only 5 characters instead of 6).

          Ng's comment is absolutely right - you need to make sure that the strings are identical. What you sent us is not helpful - it doesn't include observations that should match between the two sets. There is no way a bunch of 7 character strings with spaces in the middle starting with A should match to the observations in the first file (7 character starting with space and then W).



          Comment


          • #6
            A little wandering the interwebs suggests that these are UK postcodes, which are ... interesting.

            https://en.wikipedia.org/wiki/Postco...United_Kingdom

            Comment

            Working...
            X