  • Bug in nearstat (from SSC)

    This continues an earlier post whose title may not have caught many people's attention. As I mentioned in that post, there appears to be a bug in how nearstat (from SSC) computes or selects the nearest neighbor. The additional code below demonstrates the bug with a brute-force approach: calculate the distance between every pair of points, then keep the nearest one for each point. The variable names and setup continue to match the initial post:

    Code:
    clear
    timer clear
    set obs 1000
    set seed 1234
    
    * generate points around the globe and save
    gen id_11 = _n
    gen double lat_11 = -90 + 180 * uniform()
    gen double long_11 = -180 + 360 * uniform()
    gen id_95 = _n
    gen double lat_95_10 = -90 + 180 * uniform()
    gen double long_95_10 = -180 + 360 * uniform()
    sum
    save "geodata", replace
    
    * nearest neighbor for the first 5 obs using -geonear-
    keep if _n <= 5
    geonear id_11 lat_11 long_11 using "geodata", n(id_95 lat_95_10 long_95_10) long ra(6371.009)
    list
    
    * redo using brute force approach where -cross- is used to form
    * every pairwise combination of points
    use "geodata", clear
    keep id_95 lat_95_10 long_95_10
    save "data95", replace
    use id_11 lat_11 long_11 if _n <= 5 using "geodata", clear
    cross using "data95"
    geodist lat_11 long_11 lat_95_10 long_95_10, gen(d) sphere r(6371.009)
    bysort id_11 (d): keep if _n == 1
    list id_11 id_95 d lat_11 long_11 lat_95_10 long_95_10
    
    * redo using -nearstat-
    use "geodata", clear
    replace lat_11 = . if _n > 5
    nearstat lat_11 long_11, near( lat_95_10 long_95_10 ) distvar(DistPV3m) nid(id_95 closest)
    list id_11 closest DistPV3m in 1/5
    * the coordinates of id_95 == 783 are nowhere near the coordinates of id_11 == 2
    list lat_11 long_11 if id_11 == 2
    list lat_95_10 long_95_10 if id_95 == 783
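
    For anyone who would rather not rely on any user-written distance command, the brute-force check can also be done with a hand-written haversine formula. This is only a sketch: it assumes "geodata" and "data95" were saved by the code above, it reuses the 6371.009 km radius from that code, and the variable names dlat, dlon, a, d_manual and rank are my own additions. It ranks every id_95 point by distance from id_11 == 2 and shows where id_95 == 783 falls in that ranking.

    ```stata
    * assumes "geodata.dta" and "data95.dta" exist from the code above
    use id_11 lat_11 long_11 if id_11 == 2 using "geodata", clear
    cross using "data95"

    * differences in latitude and longitude, converted to radians
    gen double dlat = (lat_95_10 - lat_11) * _pi / 180
    gen double dlon = (long_95_10 - long_11) * _pi / 180

    * haversine great-circle distance in km on a sphere of radius 6371.009
    gen double a = sin(dlat / 2)^2 + cos(lat_11 * _pi / 180) * ///
        cos(lat_95_10 * _pi / 180) * sin(dlon / 2)^2
    gen double d_manual = 2 * 6371.009 * asin(min(1, sqrt(a)))

    * rank candidates by distance; compare the nearest with id_95 == 783
    sort d_manual id_95
    gen long rank = _n
    list id_95 d_manual rank if rank == 1 | id_95 == 783
    ```

    If the rank listed for id_95 == 783 is well above 1, that would be consistent with the -geonear- and -cross-/-geodist- results above rather than with what -nearstat- reports.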