Class KDTree

java.lang.Object
org.tribuo.math.neighbour.kdtree.KDTree
All Implemented Interfaces:
NeighboursQuery

public final class KDTree extends Object implements NeighboursQuery
A k-d tree nearest neighbour query implementation.

For the original algorithm see:

 J.L. Bentley "Multidimensional Binary Search Trees Used for Associative Searching", Commun. ACM, Vol 18, Sept. 1975, 509–517
 Multidimensional Binary Search Trees Used for Associative Searching
 

Some aspects of this implementation, particularly the logic which finds the median and partitions the points for each dimension, are derived from:

 G. Heineman, G. Pollice, and S. Selkow. "Algorithms in a Nutshell" O'Reilly Media, Inc. 2008.
 Algorithms in a Nutshell
 
  • Method Details

    • query

      public List<com.oracle.labs.mlrg.olcut.util.Pair<Integer,Double>> query(SGDVector point, int k)
      Description copied from interface: NeighboursQuery
      Queries a set of SGDVectors to determine the k points nearest to the provided point. When there are multiple points equidistant from the provided point, the order in which they are returned may vary depending on the implementation.
      Specified by:
      query in interface NeighboursQuery
      Parameters:
      point - The point to determine the nearest k points for.
      k - The number of neighbouring points to identify.
      Returns:
      A list of k Pairs, where a pair contains the index of the neighbouring point in the original data and the distance between this point and the provided point.
    • query

      public List<List<com.oracle.labs.mlrg.olcut.util.Pair<Integer,Double>>> query(SGDVector[] points, int k)
      Description copied from interface: NeighboursQuery
      Queries a set of SGDVectors to determine the k points nearest to the provided points. When there are multiple points equidistant from a provided point, the order in which they are returned may vary depending on the implementation.
      Specified by:
      query in interface NeighboursQuery
      Parameters:
      points - An array of points to determine the nearest k points for.
      k - The number of neighbouring points to identify.
      Returns:
      An list containing lists of k Pairs. There is list entry for each provided point which is a list of k pairs. Each pair contains the index of the neighbouring point in the original data and the distance between this point and the provided point.
    • queryAll

      public List<List<com.oracle.labs.mlrg.olcut.util.Pair<Integer,Double>>> queryAll(int k)
      Description copied from interface: NeighboursQuery
      Queries a set of SGDVectors to determine the k points nearest to every point in the set. When there are multiple points equidistant from a point in the set, the order in which they are returned may vary depending on the implementation.
      Specified by:
      queryAll in interface NeighboursQuery
      Parameters:
      k - The number of neighbouring points to identify.
      Returns:
      A list containing lists of k Pairs. There is list entry for each provided point which is a list of k pairs. Each pair contains the index of the neighbouring point in the original data and the distance between this point and the provided point.
    • approximateParentNode

      public org.tribuo.math.neighbour.kdtree.DimensionNode approximateParentNode(SGDVector point)
      This makes a fast approximation of the provided point's parent node, if it were being inserted into the tree.
      Parameters:
      point - The target point.
      Returns:
      The approximate parent node.