  • Distance Between Two ZIP/Postal Codes

    All,

    I'm trying to find something that'll give me the distance (preferably in miles, but anything is fine and I can convert to miles after the fact) between two ZIP codes. I've got a file that looks like this, basically:

    Column 1: Observation zip code; unique to each observation
    Column 2: A constant zip code; same value for all observations in the file.

    What I'd *like* is a command that gives me Column 3: Distance between the ZIP codes referenced in Column 1 and Column 2.

    I did a bit of searching and came up with the vincenty command, but that looks like it runs off of lat/lon coordinates. I suppose that's fine; I can just translate my ZIP codes into lat/lon coordinates based on a separate file and THEN run vincenty.

    But if someone could help me with the code that'd actually do the calculation of Column 3, I'd very much appreciate it. It's been years since I've had to do much more than out-of-the-box stuff in Stata and I'm rustier than I care to admit.

    Thanks.

  • #2
    You can easily find lat and lon for zip codes. One example file here. You would just have to merge that in.

    Another route would be to use one of the community-contributed geocoding commands, like here. Some of these commands have stopped working after Google/MapQuest/etc. changed their API, so try to find a recent one.
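
    For the first route (merging in a latitude/longitude file), the calculation can be sketched as below. This is a minimal illustration in Python rather than Stata, and several things in it are assumptions: the `ZIP_CENTROIDS` lookup and its rough coordinates are hypothetical stand-ins for a real ZIP-to-coordinate file, the function names are made up, and the haversine formula (spherical Earth) is used in place of the ellipsoidal Vincenty calculation, so results will differ slightly.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an assumed constant

# Hypothetical lookup table: ZIP code -> (latitude, longitude) of its centroid.
# The coordinates are rough, illustrative values, not authoritative data.
ZIP_CENTROIDS = {
    "10001": (40.75, -73.99),
    "19104": (39.96, -75.20),
    "60601": (41.89, -87.62),
}


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def zip_distance_miles(zip_a, zip_b, centroids=ZIP_CENTROIDS):
    """Distance between two ZIP centroids, or None if either ZIP is missing."""
    if zip_a not in centroids or zip_b not in centroids:
        return None
    return haversine_miles(*centroids[zip_a], *centroids[zip_b])


# Column 1 varies by observation; Column 2 is the constant ZIP code.
observations = ["10001", "60601", "99999"]
constant_zip = "19104"
column3 = [zip_distance_miles(z, constant_zip) for z in observations]
```

    ZIP codes that are absent from the lookup come back as `None` rather than raising an error, which mirrors what an unmatched observation would look like after a merge.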


    • #3
      I would actually say this is not a simple problem as presented. A zip code is a polygon, so the distance between two zip codes isn't easy to define: there are infinitely many points within each polygon, and any pairing of a point in the first with a point in the second yields a different distance. Do you want the distance between the centroids of the polygons (i.e., the distance between two points), the distance between the closest points of the two polygons (i.e., a minimum taken over the boundaries), the distance between the furthest two points of the polygons, or something different?
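
      To make the three definitions concrete, here is a small Python sketch. Everything in it is an assumption for illustration: the two areas are hypothetical squares given as sampled boundary points in planar coordinates (miles), the mean of the sampled points stands in for a true polygon centroid, and the closest and furthest distances are taken over the sampled points only, so for real shapes they only approximate the true values.

```python
import math
from itertools import product

# Hypothetical boundary samples for two areas, in planar coordinates (miles).
area_a = [(0, 0), (2, 0), (2, 2), (0, 2)]
area_b = [(5, 0), (7, 0), (7, 2), (5, 2)]


def mean_point(points):
    """Average of the sampled points; a crude stand-in for a polygon centroid."""
    xs, ys = zip(*points)
    return (sum(xs) / len(xs), sum(ys) / len(ys))


def centroid_distance(a, b):
    """Distance between the mean points of the two samples."""
    return math.dist(mean_point(a), mean_point(b))


def closest_distance(a, b):
    """Smallest distance over all pairs of sampled points."""
    return min(math.dist(p, q) for p, q in product(a, b))


def furthest_distance(a, b):
    """Largest distance over all pairs of sampled points."""
    return max(math.dist(p, q) for p, q in product(a, b))


print(centroid_distance(area_a, area_b))
print(closest_distance(area_a, area_b))
print(furthest_distance(area_a, area_b))
```

      Even for these two simple squares the three definitions give noticeably different answers, which is why the choice needs to be made explicitly.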


      • #4
        In general, this sort of question arises when ZIP codes are being used as readily available proxies for the actual locations of, for example, a consumer and a retailer. In such cases, it is common to use the distance between ZIP code centroids (for which latitude and longitude are readily available) as a proxy for the actual distance between the two locations, keeping in mind that straight-line distance is itself an approximation for "distance by road" or "driving time" or "transit time". "Close enough for marketing" is the applicable qualifier here, which is why "find the nearest retailer" online engines often want to send me to stores on the other side of the Delaware River from my residence.


        • #5
          I think this is actually even worse in theory, as ZIP codes are not actually polygons, but merely lists of addresses designed to make mail delivery more efficient: https://gis.stackexchange.com/questi...ode-boundaries.

          You can sometimes put a convex hull around them and get a polygon, but this is often not possible.

          Using distance between population-weighted centroids might be a reasonable approximation for many purposes.
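
          As a rough illustration of what a population-weighted centroid is, here is a minimal Python sketch. The sub-area coordinates and populations are hypothetical, the function name is made up, and averaging latitude and longitude directly is an assumption that is only reasonable for small areas away from the poles and the antimeridian.

```python
def population_weighted_centroid(blocks):
    """Weighted mean location of a list of (latitude, longitude, population) tuples."""
    total = sum(pop for _, _, pop in blocks)
    if total == 0:
        raise ValueError("total population is zero")
    lat = sum(la * pop for la, _, pop in blocks) / total
    lon = sum(lo * pop for _, lo, pop in blocks) / total
    return lat, lon


# Hypothetical sub-areas of one ZIP code: (latitude, longitude, population).
blocks = [(40.00, -75.00, 100), (40.10, -75.20, 300)]
centroid = population_weighted_centroid(blocks)
```

          Because the second sub-area has three times the population, the result sits three quarters of the way toward it rather than at the midpoint.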
