  • Mata implementation of a fast (k) nearest neighbours lookup algorithm

    Hello everyone

    I have implemented a kd-tree search algorithm in Mata, that can find the k nearest neighbours of a p-dimensional point among a set of points. For large data sets, this can be much faster than a 'brute force' search, and it could be useful for researchers doing spatial analysis.

    The code is available from my GitHub repository. Simply download and run the file; this will initialize all the Mata functions. Example usage:
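    version 15.1
    mata: mata clear
    mata: mata set matastrict on
    // run the downloaded file after this point: mata clear removes previously defined Mata functions
    mata:
        N = 10000                       // number of points
        k = 5                           // neighbours to find per query point
        query_coords = runiform(N,2)    // N random 2-dimensional query points
        data_coords = runiform(N,2)     // N random 2-dimensional data points
        knn(query_coords, data_coords, k, kni=., knd=.)
    end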
    The matrices kni and knd contain, for each query point, the indices of the k nearest data points and the distances to them. Of course, the query and the data points could be the same, in which case the first nearest neighbour is always 'self'. Duplicate rows in data_coords are not allowed and will throw an error.
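
    For comparison, a 'brute force' search can be sketched in a few lines of Mata. This is only a minimal baseline for cross-checking results on small inputs, not the kd-tree code from the repository. The function name knn_brute, the Euclidean metric, and the output layout (one row per query point, k columns, nearest first) are assumptions made for illustration and may differ from what knn() returns.

```mata
mata:
// Hypothetical helper, not part of the posted code:
// brute-force k nearest neighbours, assuming Euclidean distance.
void knn_brute(real matrix query, real matrix data, real scalar k,
               real matrix idx, real matrix dist)
{
    real scalar    i
    real colvector d2, ord

    // assumed layout: one row per query point, k columns, nearest first
    idx  = J(rows(query), k, .)
    dist = J(rows(query), k, .)

    for (i = 1; i <= rows(query); i++) {
        // squared distance from query point i to every data point
        d2  = rowsum((data :- query[i, .]):^2)
        // order() returns the permutation that sorts d2 ascending
        ord = order(d2, 1)
        ord = ord[|1 \ k|]
        idx[i, .]  = ord'
        dist[i, .] = sqrt(d2[ord, 1])'
    }
}

// small usage example: 3 nearest neighbours of the first 5 data points
data = runiform(1000, 2)
knn_brute(data[|1,1 \ 5,2|], data, 3, bi=., bd=.)
bi, bd
end
```

    Each query costs a full pass over the data plus a sort, so the total work grows with the product of the number of query points and the number of data points. That is what makes a tree-based search attractive for large N.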

    So far I have only tested it thoroughly with 2-dimensional points. If you feel that this is useful, or if you find any bugs, kindly let me know! I am also considering uploading it to the SSC archive, but have not found the time to do so yet.

    Last edited by Robert Aue; 31 Jan 2020, 04:32.