Class RemoveMisclassified

java.lang.Object
weka.filters.Filter
weka.filters.unsupervised.instance.RemoveMisclassified
All Implemented Interfaces:
Serializable, CapabilitiesHandler, CapabilitiesIgnorer, CommandlineRunnable, OptionHandler, RevisionHandler, WeightedAttributesHandler, WeightedInstancesHandler, UnsupervisedFilter

public class RemoveMisclassified extends Filter implements UnsupervisedFilter, OptionHandler, WeightedAttributesHandler, WeightedInstancesHandler
A filter that removes instances which are incorrectly classified. Useful for removing outliers.

Valid options are:

 -W <classifier specification>
  Full class name of classifier to use, followed
  by scheme options, e.g.:
   "weka.classifiers.bayes.NaiveBayes -D"
  (default: weka.classifiers.rules.ZeroR)
 
 -C <class index>
  Attribute on which misclassifications are based.
  If < 0, uses the currently set class, or defaults to the last attribute if none is set.
 
 -F <number of folds>
  The number of folds to use for cross-validation cleansing.
  (<2 = no cross-validation - default).
 
 -T <threshold>
  Threshold for the max error when predicting numeric class.
  (Value should be >= 0, default = 0.1).
 
 -I <max iterations>
  The maximum number of cleansing iterations to perform.
  (<1 = until fully cleansed - default)
 
 -V
  Invert the match so that correctly classified instances are discarded.
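
The basic idea behind the filter (train a classifier, test it on the data, drop what it gets wrong) can be pictured with a small self-contained sketch. It does not use the Weka API: `CleansingSketch`, `majorityLabel`, and `cleanse` are hypothetical names, and a majority-class predictor stands in for the default ZeroR classifier on a nominal class. Treat it as an illustration under those assumptions, not as the filter's actual implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: not part of Weka. All names here are hypothetical. */
final class CleansingSketch {

  /** Stand-in "classifier": always predicts the most frequent training label. */
  static String majorityLabel(List<String> labels) {
    Map<String, Integer> counts = new LinkedHashMap<>();
    for (String label : labels) {
      counts.merge(label, 1, Integer::sum);
    }
    String best = null;
    int bestCount = -1;
    for (Map.Entry<String, Integer> e : counts.entrySet()) {
      if (e.getValue() > bestCount) { // ties go to the label seen first
        best = e.getKey();
        bestCount = e.getValue();
      }
    }
    return best;
  }

  /**
   * One cleansing pass without cross-validation: train on the data, test on
   * the same data, and drop the misclassified labels. With invert = true the
   * dropped labels are returned instead (the idea behind -V).
   */
  static List<String> cleanse(List<String> labels, boolean invert) {
    String predicted = majorityLabel(labels);
    List<String> kept = new ArrayList<>();
    List<String> dropped = new ArrayList<>();
    for (String actual : labels) {
      if (actual.equals(predicted)) {
        kept.add(actual);
      } else {
        dropped.add(actual);
      }
    }
    return invert ? dropped : kept;
  }

  public static void main(String[] args) {
    List<String> labels = Arrays.asList("a", "a", "a", "b");
    System.out.println(cleanse(labels, false)); // [a, a, a]
    System.out.println(cleanse(labels, true));  // [b]
  }
}
```

With a stronger classifier than a majority vote, fewer instances would typically be misclassified, so the choice passed to -W largely determines how aggressive the cleansing is.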
 
Version:
$Revision: 14508 $
Author:
Richard Kirkby (rkirkby@cs.waikato.ac.nz), Malcolm Ware (mfw4@cs.waikato.ac.nz)
  • Constructor Details

    • RemoveMisclassified

      public RemoveMisclassified()
  • Method Details

    • getCapabilities

      public Capabilities getCapabilities()
      Returns the Capabilities of this filter.
      Specified by:
      getCapabilities in interface CapabilitiesHandler
      Overrides:
      getCapabilities in class Filter
      Returns:
      the capabilities of this object
    • setInputFormat

      public boolean setInputFormat(Instances instanceInfo) throws Exception
      Sets the format of the input instances.
      Overrides:
      setInputFormat in class Filter
      Parameters:
      instanceInfo - an Instances object containing the input instance structure (any instances contained in the object are ignored - only the structure is required).
      Returns:
      true if the outputFormat may be collected immediately
      Throws:
      Exception - if the inputFormat can't be set successfully
    • input

      public boolean input(Instance instance) throws Exception
      Input an instance for filtering.
      Overrides:
      input in class Filter
      Parameters:
      instance - the input instance
      Returns:
      true if the filtered instance may now be collected with output().
      Throws:
      NullPointerException - if the input format has not been defined.
      Exception - if the input instance was not of the correct format or if there was a problem with the filtering.
    • batchFinished

      public boolean batchFinished() throws Exception
      Signify that this batch of input to the filter is finished.
      Overrides:
      batchFinished in class Filter
      Returns:
      true if there are instances pending output
      Throws:
      IllegalStateException - if no input structure has been defined
      NullPointerException - if no input structure has been defined.
      Exception - if there was a problem finishing the batch.
    • listOptions

      public Enumeration<Option> listOptions()
      Returns an enumeration describing the available options.
      Specified by:
      listOptions in interface OptionHandler
      Overrides:
      listOptions in class Filter
      Returns:
      an enumeration of all the available options.
    • setOptions

      public void setOptions(String[] options) throws Exception
      Parses a given list of options.

      Valid options are:

       -W <classifier specification>
        Full class name of classifier to use, followed
        by scheme options, e.g.:
         "weka.classifiers.bayes.NaiveBayes -D"
        (default: weka.classifiers.rules.ZeroR)
       
       -C <class index>
        Attribute on which misclassifications are based.
        If < 0, uses the currently set class, or defaults to the last attribute if none is set.
       
       -F <number of folds>
        The number of folds to use for cross-validation cleansing.
        (<2 = no cross-validation - default).
       
       -T <threshold>
        Threshold for the max error when predicting numeric class.
        (Value should be >= 0, default = 0.1).
       
       -I <max iterations>
        The maximum number of cleansing iterations to perform.
        (<1 = until fully cleansed - default)
       
       -V
        Invert the match so that correctly classified instances are discarded.
       
      Specified by:
      setOptions in interface OptionHandler
      Overrides:
      setOptions in class Filter
      Parameters:
      options - the list of options as an array of strings
      Throws:
      Exception - if an option is not supported
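
As a rough illustration of what the `options` array might look like, the sketch below assembles one from plain values. `OptionArraySketch` and `buildOptions` are hypothetical helpers, not part of Weka. It assumes that the classifier specification (class name plus scheme options) travels as a single array element, and that -I takes a numeric value as its description implies.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: a hypothetical helper, not part of Weka. */
final class OptionArraySketch {

  /** Assembles an options array from the flags documented for this filter. */
  static String[] buildOptions(String classifierSpec, int classIndex, int folds,
                               double threshold, int maxIterations, boolean invert) {
    List<String> opts = new ArrayList<>();
    opts.add("-W");
    opts.add(classifierSpec); // assumption: class name and scheme options in one element
    opts.add("-C");
    opts.add(Integer.toString(classIndex));
    opts.add("-F");
    opts.add(Integer.toString(folds));
    opts.add("-T");
    opts.add(Double.toString(threshold));
    opts.add("-I");
    opts.add(Integer.toString(maxIterations));
    if (invert) {
      opts.add("-V"); // a flag: present or absent, no value
    }
    return opts.toArray(new String[0]);
  }

  public static void main(String[] args) {
    String[] opts =
        buildOptions("weka.classifiers.bayes.NaiveBayes -D", -1, 0, 0.1, 0, false);
    System.out.println(String.join(" | ", opts));
  }
}
```

An array built this way is the kind of value one would pass to `setOptions(String[])`; `getOptions()` is documented to return an array in a form suitable for passing back in.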
    • getOptions

      public String[] getOptions()
      Gets the current settings of the filter.
      Specified by:
      getOptions in interface OptionHandler
      Overrides:
      getOptions in class Filter
      Returns:
      an array of strings suitable for passing to setOptions
    • globalInfo

      public String globalInfo()
      Returns a string describing this filter.
      Returns:
      a description of the filter suitable for displaying in the explorer/experimenter gui
    • classifierTipText

      public String classifierTipText()
      Returns the tip text for this property.
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • setClassifier

      public void setClassifier(Classifier classifier)
      Sets the classifier to classify instances with.
      Parameters:
      classifier - The classifier to be used (with its options set).
    • getClassifier

      public Classifier getClassifier()
      Gets the classifier used by the filter.
      Returns:
      The classifier to be used.
    • classIndexTipText

      public String classIndexTipText()
      Returns the tip text for this property.
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • setClassIndex

      public void setClassIndex(int classIndex)
      Sets the attribute on which misclassifications are based. If < 0, the filter uses the currently set class, or defaults to the last attribute if none is set.
      Parameters:
      classIndex - the class index.
    • getClassIndex

      public int getClassIndex()
      Gets the attribute on which misclassifications are based.
      Returns:
      the class index.
    • numFoldsTipText

      public String numFoldsTipText()
      Returns the tip text for this property.
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • setNumFolds

      public void setNumFolds(int numOfFolds)
      Sets the number of cross-validation folds to use; a value < 2 means no cross-validation.
      Parameters:
      numOfFolds - the number of folds.
    • getNumFolds

      public int getNumFolds()
      Gets the number of cross-validation folds used by the filter.
      Returns:
      the number of folds.
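
To make the effect of the fold count more concrete, here is a minimal sketch of one cross-validation cleansing pass: each instance is judged by a model trained without that instance's fold. It is not the Weka implementation. `CrossValidationCleansingSketch`, `majorityLabel`, and `keptIndices` are hypothetical names, the round-robin fold assignment is an assumption made for simplicity, and a majority-class predictor stands in for a real classifier.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: not part of Weka. All names here are hypothetical. */
final class CrossValidationCleansingSketch {

  /** Stand-in "classifier": the most frequent label among the training labels. */
  static String majorityLabel(List<String> labels) {
    Map<String, Integer> counts = new LinkedHashMap<>();
    for (String label : labels) {
      counts.merge(label, 1, Integer::sum);
    }
    String best = null;
    int bestCount = -1;
    for (Map.Entry<String, Integer> e : counts.entrySet()) {
      if (e.getValue() > bestCount) { // ties go to the label seen first
        best = e.getKey();
        bestCount = e.getValue();
      }
    }
    return best;
  }

  /**
   * One pass over all folds (folds should be >= 2). Instance i belongs to
   * fold i % folds; it is kept only if a model trained on the other folds
   * predicts its label. Returns the kept indices in their original order.
   */
  static List<Integer> keptIndices(List<String> labels, int folds) {
    boolean[] keep = new boolean[labels.size()];
    for (int fold = 0; fold < folds; fold++) {
      // Training data: everything outside the current fold.
      List<String> train = new ArrayList<>();
      for (int i = 0; i < labels.size(); i++) {
        if (i % folds != fold) {
          train.add(labels.get(i));
        }
      }
      String predicted = majorityLabel(train);
      // Test data: the held-out fold.
      for (int i = fold; i < labels.size(); i += folds) {
        keep[i] = labels.get(i).equals(predicted);
      }
    }
    List<Integer> kept = new ArrayList<>();
    for (int i = 0; i < keep.length; i++) {
      if (keep[i]) {
        kept.add(i);
      }
    }
    return kept;
  }

  public static void main(String[] args) {
    List<String> labels = Arrays.asList("a", "a", "b", "a", "a", "b", "a", "a");
    System.out.println(keptIndices(labels, 2)); // [0, 1, 3, 4, 6, 7]
  }
}
```

Because every instance is scored by a model that never saw it, this mode is less prone to keeping instances that a classifier has simply memorised, at the cost of training once per fold.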
    • thresholdTipText

      public String thresholdTipText()
      Returns the tip text for this property.
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • setThreshold

      public void setThreshold(double threshold)
      Sets the threshold for the max error when predicting a numeric class. The value should be >= 0.
      Parameters:
      threshold - the numeric threshold.
    • getThreshold

      public double getThreshold()
      Gets the threshold for the max error when predicting a numeric class.
      Returns:
      the numeric threshold.
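
For a numeric class there is no exact "right or wrong" prediction, so the threshold decides how large an error still counts as correct. A minimal sketch of such a check follows. `ThresholdSketch` and `withinThreshold` are hypothetical names, and treating the bounds as inclusive is an assumption.

```java
/** Illustrative only: a hypothetical helper, not part of Weka. */
final class ThresholdSketch {

  /** True if the prediction lies within +/- threshold of the actual value (bounds inclusive). */
  static boolean withinThreshold(double predicted, double actual, double threshold) {
    return Math.abs(predicted - actual) <= threshold;
  }

  public static void main(String[] args) {
    System.out.println(withinThreshold(1.05, 1.0, 0.1)); // true: error of about 0.05
    System.out.println(withinThreshold(1.5, 1.0, 0.1));  // false: error of 0.5
  }
}
```

The threshold is an absolute error in the units of the class attribute, so the default of 0.1 may be far too strict or too lenient depending on the scale of the data.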
    • maxIterationsTipText

      public String maxIterationsTipText()
      Returns the tip text for this property.
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • setMaxIterations

      public void setMaxIterations(int iterations)
      Sets the maximum number of cleansing iterations to perform; a value < 1 means iterate until fully cleansed.
      Parameters:
      iterations - the maximum number of iterations.
    • getMaxIterations

      public int getMaxIterations()
      Gets the maximum number of cleansing iterations performed.
      Returns:
      the maximum number of iterations.
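
Removing instances changes what the classifier learns, so a second pass can flag instances that the first pass accepted. The sketch below shows how an iteration cap can change the result. It is a simplified model rather than the Weka code: `IterationCapSketch`, `mean`, and `cleanse` are hypothetical names, a mean predictor stands in for a numeric-class classifier, and "fully cleansed" is read as "a pass removed nothing".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative only: not part of Weka. All names here are hypothetical. */
final class IterationCapSketch {

  /** Stand-in "classifier": predicts the mean of the training targets. */
  static double mean(List<Double> values) {
    double sum = 0.0;
    for (double v : values) {
      sum += v;
    }
    return values.isEmpty() ? 0.0 : sum / values.size();
  }

  /**
   * Repeats train-and-filter passes until a pass removes nothing, or until
   * maxIterations passes have run (maxIterations < 1 means no cap).
   */
  static List<Double> cleanse(List<Double> targets, double threshold, int maxIterations) {
    List<Double> current = new ArrayList<>(targets);
    int iterations = 0;
    while (true) {
      if (maxIterations > 0 && iterations >= maxIterations) {
        break; // cap reached
      }
      iterations++;
      double predicted = mean(current);
      List<Double> kept = new ArrayList<>();
      for (double actual : current) {
        if (Math.abs(predicted - actual) <= threshold) {
          kept.add(actual);
        }
      }
      boolean stable = kept.size() == current.size();
      current = kept;
      if (stable) {
        break; // nothing removed in this pass
      }
    }
    return current;
  }

  public static void main(String[] args) {
    List<Double> targets =
        Arrays.asList(0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 3.0, 12.0);
    System.out.println(cleanse(targets, 2.0, 1).size()); // 9: only 12.0 is removed
    System.out.println(cleanse(targets, 2.0, 0).size()); // 8: 3.0 is removed in the second pass
  }
}
```

In this toy data the value 12.0 pulls the mean up far enough to shelter 3.0 during the first pass; once 12.0 is gone, 3.0 falls outside the threshold. A cap of 1 therefore stops before that second removal.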
    • invertTipText

      public String invertTipText()
      Returns the tip text for this property.
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • setInvert

      public void setInvert(boolean invert)
      Set whether selection is inverted.
      Parameters:
      invert - whether or not to invert selection.
    • getInvert

      public boolean getInvert()
      Get whether selection is inverted.
      Returns:
      whether or not selection is inverted.
    • getRevision

      public String getRevision()
      Returns the revision string.
      Specified by:
      getRevision in interface RevisionHandler
      Overrides:
      getRevision in class Filter
      Returns:
      the revision
    • main

      public static void main(String[] argv)
      Main method for testing this class.
      Parameters:
      argv - should contain arguments to the filter: use -h for help