Cannot find paper: All k nearest neighbors search in N*log(N) using distance indices for log(N) support points

https://cs.stackexchange.com/questions/83021

nearest-neighbour

04-11-2019
|

题

I remember reading a paper about finding k nearest neighbors for all N multi-dimensional objects in the set.

I've tries to find it again many times, but have failed so far.

The algorithm was as follows:

Each object is assigned a sorted queue of nears neighbors (maximum size is k). Let's assume that $k < log(N)$
Each time you calculate $distance(A, B)$ try to add A into B's neighbor list and vise versa.
Choose $log(N)$ support objects (say, first $log(N)$ objects out of $N$). (Or maybe that was $sqrt(N)$)
For each support, build the distance-based index containing all N objects
After the index construction, every object has a candidate list of k nearest neighbors. All real nearest neighbors are not further than the furthest neighbor in the candidate list - candidate neighbor radius.
For each point we use triangle inequality and current neighbor radius to constrain the distances between a candidate point and the support points: $|dist(candidate_i, support_j) - dist(point, support_j)| <= radius$. We efficiently extract the candidates that fall in radius-based ranges for all support points.
We calculate the distances for those candidates and update the nearest-neighbor lists for both the current point and the candidate points.
As we process more points, the neighbor lists become "tighter" - the radius, the distance to the furthest neighbor decreases with each update of the list.
Tighter constraints mean that the candidate lists become smaller and smaller after each processed point and leading to total run time of $N*log(N)$.

So, why do I need the paper again? First of all, several people have asked me for it and I was surprised I cannot find it again. More importantly, it did not work well for me. This might be caused by a bad choice of the support objects. I remember nothing about the selection process.

没有正确的解决方案

许可以下： CC-BY-SA 和归因

不隶属于 cs.stackexchange