Class CoverTree

All Implemented Interfaces:
Serializable, AdditionalMeasureProducer, OptionHandler, RevisionHandler, TechnicalInformationHandler

public class CoverTree extends NearestNeighbourSearch implements TechnicalInformationHandler
Class implementing the CoverTree datastructure.
The class is very much a translation of the c source code made available by the authors.

For more information and original source code see:

Alina Beygelzimer, Sham Kakade, John Langford: Cover trees for nearest neighbor. In: ICML'06: Proceedings of the 23rd international conference on Machine learning, New York, NY, USA, 97-104, 2006.

BibTeX:

 @inproceedings{Beygelzimer2006,
    address = {New York, NY, USA},
    author = {Alina Beygelzimer and Sham Kakade and John Langford},
    booktitle = {ICML'06: Proceedings of the 23rd international conference on Machine learning},
    pages = {97-104},
    publisher = {ACM Press},
    title = {Cover trees for nearest neighbor},
    year = {2006},
    location = {Pittsburgh, Pennsylvania},
    HTTP = {http://hunch.net/\~jl/projects/cover_tree/cover_tree.html}
 }
 

Valid options are:

 -B <value>
  Set base of the expansion constant
  (default = 1.3).
 
Version:
$Revision: 10203 $
Author:
Alina Beygelzimer (original C++ code), Sham Kakade (original C++ code), John Langford (original C++ code), Ashraf M. Kibriya (amk14[at-the-rate]cs[dot]waikato[dot]ac[dot]nz) (Java port)
See Also:
  • Constructor Details

    • CoverTree

      public CoverTree()
      default constructor.
  • Method Details

    • globalInfo

      public String globalInfo()
      Returns a string describing this nearest neighbour search algorithm.
      Overrides:
      globalInfo in class NearestNeighbourSearch
      Returns:
      a description of the algorithm for displaying in the explorer/experimenter gui
    • getTechnicalInformation

      public TechnicalInformation getTechnicalInformation()
      Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
      Specified by:
      getTechnicalInformation in interface TechnicalInformationHandler
      Returns:
      the technical information about this class
    • listOptions

      public Enumeration<Option> listOptions()
      Returns an enumeration describing the available options.
      Specified by:
      listOptions in interface OptionHandler
      Overrides:
      listOptions in class NearestNeighbourSearch
      Returns:
      an enumeration of all the available options.
    • setOptions

      public void setOptions(String[] options) throws Exception
      Parses a given list of options.

      Valid options are:

       -B <value>
        Set base of the expansion constant
        (default = 1.3).
       
      Specified by:
      setOptions in interface OptionHandler
      Overrides:
      setOptions in class NearestNeighbourSearch
      Parameters:
      options - the list of options as an array of strings
      Throws:
      Exception - if an option is not supported
    • getOptions

      public String[] getOptions()
      Gets the current settings of KDtree.
      Specified by:
      getOptions in interface OptionHandler
      Overrides:
      getOptions in class NearestNeighbourSearch
      Returns:
      an array of strings suitable for passing to setOptions
    • kNearestNeighbours

      public Instances kNearestNeighbours(Instance target, int k) throws Exception
      Returns k-NNs of a given target instance, from among the previously supplied training instances (supplied through setInstances method) P.S.: May return more than k-NNs if more one instances have the same distance to the target as the kth NN.
      Specified by:
      kNearestNeighbours in class NearestNeighbourSearch
      Parameters:
      target - The instance for which k-NNs are required.
      k - The number of k-NNs to find.
      Returns:
      The k-NN instances of the given target instance.
      Throws:
      Exception - If there is some problem find the k-NNs.
    • nearestNeighbour

      public Instance nearestNeighbour(Instance target) throws Exception
      Returns the NN instance of a given target instance, from among the previously supplied training instances.
      Specified by:
      nearestNeighbour in class NearestNeighbourSearch
      Parameters:
      target - The instance for which NN is required.
      Returns:
      The NN instance of the target instance.
      Throws:
      Exception - If there is some problem finding the nearest neighbour.
    • getDistances

      public double[] getDistances() throws Exception
      Returns the distances of the (k)-NN(s) found earlier by kNearestNeighbours()/nearestNeighbour().
      Specified by:
      getDistances in class NearestNeighbourSearch
      Returns:
      The distances (in the same order) of the k-NNs.
      Throws:
      Exception - If the tree hasn't been built (by calling setInstances()), or none of kNearestNeighbours() or nearestNeighbour() has been called before.
    • setInstances

      public void setInstances(Instances instances) throws Exception
      Builds the Cover Tree on the given set of instances.
      Overrides:
      setInstances in class NearestNeighbourSearch
      Parameters:
      instances - The insts on which the Cover Tree is to be built.
      Throws:
      Exception - If some error occurs while building the Cover Tree
    • update

      public void update(Instance ins) throws Exception
      Adds an instance to the cover tree. P.S.: The current version doesn't allow addition of instances after batch construction.
      Specified by:
      update in class NearestNeighbourSearch
      Parameters:
      ins - The instance to add.
      Throws:
      Exception - Alway throws this, as current implementation doesn't allow addition of instances after building.
    • addInstanceInfo

      public void addInstanceInfo(Instance ins)
      Adds the given instance info. This implementation updates only the range datastructures of the EuclideanDistance. Nothing is required to be updated in the built Cover Tree.
      Overrides:
      addInstanceInfo in class NearestNeighbourSearch
      Parameters:
      ins - The instance to add the information of. Usually this is the test instance supplied to update the range of attributes in the distance function.
    • setDistanceFunction

      public void setDistanceFunction(DistanceFunction df) throws Exception
      Sets the distance function to use for nearest neighbour search. Currently only EuclideanDistance is supported.
      Overrides:
      setDistanceFunction in class NearestNeighbourSearch
      Parameters:
      df - the distance function to use
      Throws:
      Exception - if not EuclideanDistance
    • baseTipText

      public String baseTipText()
      Returns the tip text for this property.
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • getBase

      public double getBase()
      Returns the base in use for expansion constant.
      Returns:
      base currently in use.
    • setBase

      public void setBase(double b)
      Sets the base to use for expansion constant. The 2 in 2^i in the paper.
      Parameters:
      b - the new base;
    • measureTreeSize

      public double measureTreeSize()
      Returns the size of the tree. (number of internal nodes + number of leaves)
      Returns:
      the size of the tree
    • measureNumLeaves

      public double measureNumLeaves()
      Returns the number of leaves.
      Returns:
      the number of leaves
    • measureMaxDepth

      public double measureMaxDepth()
      Returns the depth of the tree.
      Returns:
      the number of rules
    • enumerateMeasures

      public Enumeration<String> enumerateMeasures()
      Returns an enumeration of the additional measure names.
      Specified by:
      enumerateMeasures in interface AdditionalMeasureProducer
      Overrides:
      enumerateMeasures in class NearestNeighbourSearch
      Returns:
      an enumeration of the measure names
    • getMeasure

      public double getMeasure(String additionalMeasureName)
      Returns the value of the named measure.
      Specified by:
      getMeasure in interface AdditionalMeasureProducer
      Overrides:
      getMeasure in class NearestNeighbourSearch
      Parameters:
      additionalMeasureName - the name of the measure to query for its value
      Returns:
      the value of the named measure
      Throws:
      IllegalArgumentException - if the named measure is not supported
    • getRevision

      public String getRevision()
      Returns the revision string.
      Specified by:
      getRevision in interface RevisionHandler
      Returns:
      the revision
    • main

      public static void main(String[] args)
      Method for testing the class from command line.
      Parameters:
      args - The supplied command line arguments.