This is a follow-up to an earlier post whose title may not have caught many people's attention. As I mentioned there, there appears to be a bug in how nearstat (from SSC) computes or selects the nearest neighbor. Here is some additional code that demonstrates the bug using a brute-force approach: compute the distance between every pair of points, then pick the nearest. The variable names and setup match those of the initial post:
Code:
clear
timer clear
set obs 1000
set seed 1234

* generate points around the globe and save
gen id_11 = _n
gen double lat_11 = -90 + 180 * uniform()
gen double long_11 = -180 + 360 * uniform()
gen id_95 = _n
gen double lat_95_10 = -90 + 180 * uniform()
gen double long_95_10 = -180 + 360 * uniform()
sum
save "geodata", replace

* nearest neighbor for the first 5 obs using -geonear-
keep if _n <= 5
geonear id_11 lat_11 long_11 using "geodata", n(id_95 lat_95_10 long_95_10) long ra(6371.009)
list

* redo using brute force approach where -cross- is used to form
* every pairwise combination of points
use "geodata", clear
keep id_95 lat_95_10 long_95_10
save "data95", replace
use id_11 lat_11 long_11 if _n <= 5 using "geodata", clear
cross using "data95"
geodist lat_11 long_11 lat_95_10 long_95_10, gen(d) sphere r(6371.009)
bysort id_11 (d): keep if _n == 1
list id_11 id_95 d lat_11 long_11 lat_95_10 long_95_10

* redo using -nearstat-
use "geodata", clear
replace lat_11 = . if _n > 5
nearstat lat_11 long_11, near( lat_95_10 long_95_10 ) distvar(DistPV3m) nid(id_95 closest)
list id_11 closest DistPV3m in 1/5

* the coordinates of id_95 == 783 are nowhere near the coordinates of id_11 == 2
list lat_11 long_11 if id_11 == 2
list lat_95_10 long_95_10 if id_95 == 783
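
For anyone who would rather not take geonear or geodist on trust either, the same check can be done for a single point with built-in functions only. The following is a minimal sketch, not part of the original comparison: it assumes that the usual haversine great-circle formula is an acceptable stand-in for what geodist computes with the sphere option, and the names lat0, lon0, d2r and d_hav are mine. It computes the distance from id_11 == 2 to every id_95 point, lists the closest one, and then lists the distance to id_95 == 783 for comparison.
Code:
* hand-computed haversine distances from id_11 == 2 (built-in functions only)
* assumes "geodata.dta" was created by the code above
use "geodata", clear

* store the coordinates of the point of interest in scalars
sum lat_11 if id_11 == 2, meanonly
scalar lat0 = r(mean)
sum long_11 if id_11 == 2, meanonly
scalar lon0 = r(mean)
scalar d2r = _pi / 180

* haversine formula, same sphere radius (in km) as above
gen double d_hav = 2 * 6371.009 * asin(sqrt( ///
    sin((lat_95_10 - lat0) * d2r / 2)^2 + ///
    cos(lat0 * d2r) * cos(lat_95_10 * d2r) * ///
    sin((long_95_10 - lon0) * d2r / 2)^2 ))

* the nearest id_95 point, then the point reported by -nearstat-
sort d_hav
list id_95 d_hav lat_95_10 long_95_10 in 1
list id_95 d_hav lat_95_10 long_95_10 if id_95 == 783

If the brute-force listing above is right, the first of these two listings should show the same id_95 as the -cross- and -geodist- result for id_11 == 2, and the second should show a much larger distance.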