Package weka.core

Class EuclideanDistance

All Implemented Interfaces:
Serializable, Cloneable, DistanceFunction, OptionHandler, RevisionHandler, TechnicalInformationHandler

public class EuclideanDistance extends NormalizableDistance implements Cloneable, TechnicalInformationHandler
Implements the Euclidean distance (or similarity) function.

A single object does not represent one distance; it defines the data model within which distances between instances of that model can be computed.

Attention: For efficiency reasons, few consistency checks are performed (for example, whether the data models of the two instances are exactly the same).

For more information, see:

Wikipedia. Euclidean distance. URL http://en.wikipedia.org/wiki/Euclidean_distance.

BibTeX:

 @misc{missing_id,
    author = {Wikipedia},
    title = {Euclidean distance},
    URL = {http://en.wikipedia.org/wiki/Euclidean_distance}
 }
 

Valid options are:

 -D
  Turns off the normalization of attribute 
  values in distance calculation.
 -R <col1,col2-col4,...>
  Specifies the list of columns to be used in the calculation of the 
  distance. 'first' and 'last' are valid indices.
  (default: first-last)
 -V
  Invert matching sense of column indices.
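
The effect of the three options can be illustrated with a small, self-contained sketch. This is not the Weka implementation: the class name NormalizedEuclideanSketch, the boolean column mask, and the min-max scaling to [0, 1] are assumptions made for illustration only.

```java
/** Hypothetical illustration of the -D, -R and -V options; not Weka code. */
final class NormalizedEuclideanSketch {

    /**
     * Euclidean distance over the active columns.
     * normalize = false mirrors the effect of -D,
     * selected marks the columns listed with -R,
     * invert mirrors -V.
     */
    static double distance(double[] a, double[] b, double[] min, double[] max,
                           boolean[] selected, boolean invert, boolean normalize) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            // A column is skipped when its selection flag equals the invert flag.
            if (selected[i] == invert) {
                continue;
            }
            double x = a[i];
            double y = b[i];
            if (normalize) {
                double width = max[i] - min[i];
                // Assumption: a constant column (zero width) contributes nothing.
                x = width == 0.0 ? 0.0 : (x - min[i]) / width;
                y = width == 0.0 ? 0.0 : (y - min[i]) / width;
            }
            double diff = x - y;
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }
}
```

With normalization enabled, every active column contributes on the same [0, 1] scale, so attributes with large numeric ranges do not dominate the result.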
Version:
$Revision: 8034 $
Author:
Gabi Schmidberger (gabi@cs.waikato.ac.nz), Ashraf M. Kibriya (amk14@cs.waikato.ac.nz), FracPete (fracpete at waikato dot ac dot nz)
  • Constructor Details

    • EuclideanDistance

      public EuclideanDistance()
      Constructs a Euclidean distance object; the Instances must still be set.
    • EuclideanDistance

      public EuclideanDistance(Instances data)
      Constructs a Euclidean distance object and automatically initializes the ranges.
      Parameters:
      data - the instances the distance function should work on
  • Method Details

    • globalInfo

      public String globalInfo()
      Returns a string describing this object.
      Specified by:
      globalInfo in class NormalizableDistance
      Returns:
      a description of this object suitable for displaying in the Explorer/Experimenter GUI
    • getTechnicalInformation

      public TechnicalInformation getTechnicalInformation()
      Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
      Specified by:
      getTechnicalInformation in interface TechnicalInformationHandler
      Returns:
      the technical information about this class
    • distance

      public double distance(Instance first, Instance second)
      Calculates the distance between two instances.
      Specified by:
      distance in interface DistanceFunction
      Overrides:
      distance in class NormalizableDistance
      Parameters:
      first - the first instance
      second - the second instance
      Returns:
      the distance between the two given instances
    • distance

      public double distance(Instance first, Instance second, PerformanceStats stats)
      Calculates the distance (or similarity) between two instances. The returned distance must later be passed to the postProcessDistances method to put it on the correct scale.
      Note: Do not mix the use of this function with distance(Instance first, Instance second), as that method already does the post-processing. Consider passing Double.POSITIVE_INFINITY as the cutOffValue to this function and post-processing all the distances afterwards.
      Specified by:
      distance in interface DistanceFunction
      Overrides:
      distance in class NormalizableDistance
      Parameters:
      first - the first instance
      second - the second instance
      stats - the structure for storing performance statistics.
      Returns:
      the distance between the two given instances or Double.POSITIVE_INFINITY.
    • postProcessDistances

      public void postProcessDistances(double[] distances)
      Does post-processing of the distances (if necessary) returned by distance(Instance first, Instance second, double cutOffValue). This is required to get the correct distances when distance(Instance first, Instance second, double cutOffValue) is used, because that function actually returns the squared distance to avoid inaccuracies arising from floating-point comparison.
      Specified by:
      postProcessDistances in interface DistanceFunction
      Overrides:
      postProcessDistances in class NormalizableDistance
      Parameters:
      distances - the distances to post-process
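
      The two-step pattern described above (collect squared distances, then post-process them) might look roughly like the following sketch. The class SquaredDistanceSketch and its methods are hypothetical and work on plain double arrays; treating the cut-off as a value on the squared scale is an assumption of this sketch, not a statement about the Weka source.

```java
/** Hypothetical illustration of squared distances with a cut-off; not Weka code. */
final class SquaredDistanceSketch {

    /**
     * Returns the squared Euclidean distance, or Double.POSITIVE_INFINITY
     * as soon as the running sum exceeds cutOffValue.
     */
    static double squaredDistance(double[] a, double[] b, double cutOffValue) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
            if (sum > cutOffValue) {
                // Early exit: this pair cannot be within the cut-off.
                return Double.POSITIVE_INFINITY;
            }
        }
        return sum;
    }

    /** Converts squared distances to distances in place. */
    static void postProcess(double[] distances) {
        for (int i = 0; i < distances.length; i++) {
            // Math.sqrt leaves POSITIVE_INFINITY unchanged.
            distances[i] = Math.sqrt(distances[i]);
        }
    }
}
```

      Comparing squared values keeps the inner loop free of square roots; the root is taken once per distance at the end.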
    • sqDifference

      public double sqDifference(int index, double val1, double val2)
      Returns the squared difference of two values of an attribute.
      Parameters:
      index - the attribute index
      val1 - the first value
      val2 - the second value
      Returns:
      the squared difference
    • getMiddle

      public double getMiddle(double[] ranges)
      Returns the value in the middle of the two parameter values.
      Parameters:
      ranges - the ranges of this dimension
      Returns:
      the middle value
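
      As a rough illustration, the middle of a dimension can be computed from its minimum and width. The array layout below (minimum, maximum, width at positions 0, 1, 2) and the class name RangeMiddleSketch are assumptions of this sketch and are not taken from this page.

```java
/** Hypothetical illustration of a range midpoint; not Weka code. */
final class RangeMiddleSketch {
    // Assumed layout of the per-dimension ranges array.
    static final int MIN = 0;
    static final int MAX = 1;
    static final int WIDTH = 2;

    /** Returns the point halfway between the minimum and the maximum. */
    static double middle(double[] ranges) {
        return ranges[MIN] + ranges[WIDTH] * 0.5;
    }
}
```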
    • closestPoint

      public int closestPoint(Instance instance, Instances allPoints, int[] pointList) throws Exception
      Returns the index of the point closest to the given instance. The index refers to the Instances object passed as the second parameter.
      Parameters:
      instance - the instance to assign a cluster to
      allPoints - all points
      pointList - the list of points
      Returns:
      the index of the closest point
      Throws:
      Exception - if something goes wrong
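
      The search can be pictured as a linear scan over the candidate indices. The sketch below uses plain double arrays instead of Instance and Instances, and the class ClosestPointSketch is hypothetical; it only illustrates that the returned value indexes into the full point set rather than into pointList.

```java
/** Hypothetical illustration of a nearest-point scan; not Weka code. */
final class ClosestPointSketch {

    /**
     * Returns the index (into allPoints) of the candidate in pointList
     * that is closest to instance.
     */
    static int closestPoint(double[] instance, double[][] allPoints, int[] pointList) {
        if (pointList.length == 0) {
            throw new IllegalArgumentException("pointList must not be empty");
        }
        int bestIndex = pointList[0];
        double best = Double.POSITIVE_INFINITY;
        for (int idx : pointList) {
            double sum = 0.0;
            for (int d = 0; d < instance.length; d++) {
                double diff = instance[d] - allPoints[idx][d];
                sum += diff * diff;
            }
            // Squared distances preserve the ordering, so no square root is needed.
            if (sum < best) {
                best = sum;
                bestIndex = idx;
            }
        }
        return bestIndex;
    }
}
```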
    • valueIsSmallerEqual

      public boolean valueIsSmallerEqual(Instance instance, int dim, double value)
      Returns true if the value of the given dimension is smaller than or equal to the value to be compared with.
      Parameters:
      instance - the instance from which the value should be taken
      dim - the dimension of the value
      value - the value to compare with
      Returns:
      true if the value of the instance is smaller than or equal to the given value
    • getRevision

      public String getRevision()
      Returns the revision string.
      Specified by:
      getRevision in interface RevisionHandler
      Returns:
      the revision