Class FilteredClassifier

All Implemented Interfaces:
Serializable, Cloneable, Classifier, IterativeClassifier, BatchPredictor, CapabilitiesHandler, CapabilitiesIgnorer, CommandlineRunnable, Drawable, OptionHandler, PartitionGenerator, Randomizable, RevisionHandler, WeightedAttributesHandler, WeightedInstancesHandler
Direct Known Subclasses:
RandomizableFilteredClassifier

Class for running an arbitrary classifier on data that has been passed through an arbitrary filter. Like the classifier, the structure of the filter is based exclusively on the training data and test instances will be processed by the filter without changing their structure. If unequal instance weights or attribute weights are present, and the filter or the classifier are unable to deal with them, the instances and/or attributes are resampled with replacement based on the weights before they are passed to the filter or the classifier (as appropriate).
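
The weight-based resampling mentioned above can be pictured with a small standalone sketch. It assumes plain arrays instead of Weka's `Instances`, and the class and method names (`WeightedResampleSketch`, `resampleIndices`) are hypothetical; Weka's own resampling routine may differ in detail.

```java
import java.util.Random;

/** Illustrative stand-in, not Weka's implementation. */
final class WeightedResampleSketch {

    /**
     * Draws n indices with replacement, each index chosen with probability
     * proportional to its weight. Weights must be non-negative with a positive sum.
     */
    static int[] resampleIndices(double[] weights, int n, Random rng) {
        double[] cumulative = new double[weights.length];
        double total = 0.0;
        for (int i = 0; i < weights.length; i++) {
            if (weights[i] < 0) {
                throw new IllegalArgumentException("negative weight at index " + i);
            }
            total += weights[i];
            cumulative[i] = total;
        }
        if (total <= 0) {
            throw new IllegalArgumentException("weights must sum to a positive value");
        }
        int[] picked = new int[n];
        for (int k = 0; k < n; k++) {
            double u = rng.nextDouble() * total; // uniform in [0, total)
            int idx = 0;
            // Advance to the first index whose cumulative weight exceeds u.
            while (idx < weights.length - 1 && cumulative[idx] <= u) {
                idx++;
            }
            picked[k] = idx;
        }
        return picked;
    }
}
```

After resampling, every drawn row can be treated as having equal weight, which is why the technique suits a filter or classifier that cannot handle weights. Rows with weight zero are never drawn.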

Valid options are:

 -F <filter specification>
  Full class name of filter to use, followed
  by filter options.
  default: "weka.filters.supervised.attribute.Discretize -R first-last -precision 6"
 
 -W <classifier name>
  Full name of base classifier.
  (default: weka.classifiers.trees.J48)
 
 -S num
  The random number seed to be used (default 1).
 
 -doNotCheckForModifiedClassAttribute
  If this is set, the classifier will not check whether the filter modifies the class attribute (use with caution).
 
 -output-debug-info
  If set, classifier is run in debug mode and may output additional info to the console.
 
 -do-not-check-capabilities
  If set, classifier capabilities are not checked before classifier is built (use with caution).
 
 -num-decimal-places
  The number of decimal places for the output of numbers in the model.
 
 -batch-size
  The desired batch size for batch prediction.

 Options specific to classifier weka.classifiers.trees.J48:
 
 -U
  Use unpruned tree.
 
 -C <pruning confidence>
  Set confidence threshold for pruning.
  (default 0.25)
 
 -M <minimum number of instances>
  Set minimum number of instances per leaf.
  (default 2)
 
 -R
  Use reduced error pruning.
 
 -N <number of folds>
  Set number of folds for reduced error
  pruning. One fold is used as pruning set.
  (default 3)
 
 -B
  Use binary splits only.
 
 -S
  Don't perform subtree raising.
 
 -L
  Do not clean up after the tree has been built.
 
 -A
  Laplace smoothing for predicted probabilities.
 
 -Q <seed>
  Seed for random data shuffling (default 1).
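
As a sketch of how these options fit together in an array passed to `setOptions`: the `-F` value is a single argument holding the filter class name and its options, and the classifier-specific options are placed after a `--` marker. The `--` convention is assumed here from Weka's usual handling of wrapped classifiers and is not stated in the list above. The class `FilteredClassifierOptionsSketch` and its helper are hypothetical stand-ins that use only the standard library.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class FilteredClassifierOptionsSketch {

    /** Builds an option array in the shape documented above. */
    static String[] exampleOptions() {
        return new String[] {
            // The whole filter specification is ONE argument.
            "-F", "weka.filters.supervised.attribute.Discretize -R first-last -precision 6",
            "-W", "weka.classifiers.trees.J48",
            "-S", "1",
            "--",            // assumed marker: what follows goes to the base classifier
            "-C", "0.25",
            "-M", "2"
        };
    }

    /** Hypothetical helper: returns the options that follow the first "--". */
    static List<String> baseClassifierOptions(String[] options) {
        List<String> all = Arrays.asList(options);
        int marker = all.indexOf("--");
        if (marker < 0) {
            return new ArrayList<>();
        }
        return new ArrayList<>(all.subList(marker + 1, all.size()));
    }
}
```

Keeping the J48 options after the marker avoids the clash between the wrapper's `-S num` (random seed) and J48's `-S` (no subtree raising).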
 
Version:
$Revision: 15519 $
Author:
Len Trigg (trigg@cs.waikato.ac.nz)
  • Constructor Details

    • FilteredClassifier

      public FilteredClassifier()
      Default constructor.
  • Method Details

    • globalInfo

      public String globalInfo()
      Returns a string describing this classifier
      Returns:
      a description of the classifier suitable for displaying in the explorer/experimenter gui
    • graphType

      public int graphType()
      Returns the type of graph this classifier represents.
      Specified by:
      graphType in interface Drawable
      Returns:
      the graph type of this classifier
    • graph

      public String graph() throws Exception
      Returns graph describing the classifier (if possible).
      Specified by:
      graph in interface Drawable
      Returns:
      the graph of the classifier in dotty format
      Throws:
      Exception - if the classifier cannot be graphed
    • generatePartition

      public void generatePartition(Instances data) throws Exception
      Builds the classifier to generate a partition. (If the base classifier supports this.)
      Specified by:
      generatePartition in interface PartitionGenerator
      Throws:
      Exception
    • getMembershipValues

      public double[] getMembershipValues(Instance inst) throws Exception
      Computes an array that has a value for each element in the partition. (If the base classifier supports this.)
      Specified by:
      getMembershipValues in interface PartitionGenerator
      Throws:
      Exception
    • numElements

      public int numElements() throws Exception
      Returns the number of elements in the partition. (If the base classifier supports this.)
      Specified by:
      numElements in interface PartitionGenerator
      Throws:
      Exception
    • initializeClassifier

      public void initializeClassifier(Instances data) throws Exception
      Initializes an iterative classifier. (If the base classifier supports this.)
      Specified by:
      initializeClassifier in interface IterativeClassifier
      Parameters:
      data - the instances to be used in induction
      Throws:
      Exception - if the model cannot be initialized
    • next

      public boolean next() throws Exception
      Performs one iteration. (If the base classifier supports this.)
      Specified by:
      next in interface IterativeClassifier
      Returns:
      false if no further iterations could be performed, true otherwise
      Throws:
      Exception - if this iteration fails for unexpected reasons
    • done

      public void done() throws Exception
      Signals the end of iterating; useful for any housekeeping or cleanup. (If the base classifier supports this.)
      Specified by:
      done in interface IterativeClassifier
      Throws:
      Exception - if cleanup fails
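
The calling order of `initializeClassifier`, `next` and `done` can be sketched as a simple loop. The interface below is a stand-in that mirrors the three documented method names, with plain arrays as data and checked exceptions omitted for brevity; `IterativeTrainingSketch`, `RunningMean` and `train` are hypothetical names, not part of Weka.

```java
final class IterativeTrainingSketch {

    /** Stand-in mirroring the three documented methods; not the Weka interface itself. */
    interface IterativeLearner {
        void initializeClassifier(double[] data);
        boolean next();
        void done();
    }

    /** Toy learner: each iteration consumes one value and updates a running mean. */
    static final class RunningMean implements IterativeLearner {
        private double[] data;
        private int seen;
        private double mean;
        private boolean finished;

        public void initializeClassifier(double[] data) {
            this.data = data.clone();
            seen = 0;
            mean = 0.0;
            finished = false;
        }

        public boolean next() {
            if (seen >= data.length) {
                return false; // no further iteration could be performed
            }
            seen++;
            mean += (data[seen - 1] - mean) / seen;
            return true;
        }

        public void done() {
            finished = true;
            data = null; // clean up training data that is no longer needed
        }

        double mean() { return mean; }
        boolean isFinished() { return finished; }
    }

    /** Drives the protocol; returns the number of iterations performed. */
    static int train(IterativeLearner learner, double[] data, int maxIterations) {
        learner.initializeClassifier(data);
        int iterations = 0;
        while (iterations < maxIterations && learner.next()) {
            iterations++;
        }
        learner.done();
        return iterations;
    }
}
```

The loop stops either when the iteration budget is used up or when `next` reports that nothing more can be done, and `done` is called exactly once afterwards.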
    • resumeTipText

      public String resumeTipText()
      Tool tip text for the resume property.
      Returns:
      the tool tip text for the resume property
    • setResume

      public void setResume(boolean resume) throws Exception
      If called with argument true, then the next time done() is called the model is effectively "frozen" and no further iterations can be performed
      Specified by:
      setResume in interface IterativeClassifier
      Parameters:
      resume - true if the model is to be finalized after performing iterations
      Throws:
      Exception - if finalization cannot be set
    • getResume

      public boolean getResume()
      Returns true if the model is to be finalized (or has been finalized) after training.
      Specified by:
      getResume in interface IterativeClassifier
      Returns:
      the current value of finalize
    • listOptions

      public Enumeration<Option> listOptions()
      Returns an enumeration describing the available options.
      Specified by:
      listOptions in interface OptionHandler
      Overrides:
      listOptions in class RandomizableSingleClassifierEnhancer
      Returns:
      an enumeration of all the available options.
    • setOptions

      public void setOptions(String[] options) throws Exception
      Parses a given list of options.

      Valid options are:

       -F <filter specification>
        Full class name of filter to use, followed
        by filter options.
        default: "weka.filters.supervised.attribute.Discretize -R first-last -precision 6"
       
       -W <classifier name>
        Full name of base classifier.
        (default: weka.classifiers.trees.J48)
       
       -S num
        The random number seed to be used.
       
       -doNotCheckForModifiedClassAttribute
        If this is set, the classifier will not check whether the filter modifies the class attribute (use with caution).
       
       -output-debug-info
        If set, classifier is run in debug mode and may output additional info to the console.
       
       -do-not-check-capabilities
        If set, classifier capabilities are not checked before classifier is built (use with caution).
       
       -num-decimal-places
        The number of decimal places for the output of numbers in the model.
       
       -batch-size
        The desired batch size for batch prediction.

       Options specific to classifier weka.classifiers.trees.J48:
       
       -U
        Use unpruned tree.
       
       -C <pruning confidence>
        Set confidence threshold for pruning.
        (default 0.25)
       
       -M <minimum number of instances>
        Set minimum number of instances per leaf.
        (default 2)
       
       -R
        Use reduced error pruning.
       
       -N <number of folds>
        Set number of folds for reduced error
        pruning. One fold is used as pruning set.
        (default 3)
       
       -B
        Use binary splits only.
       
       -S
        Don't perform subtree raising.
       
       -L
        Do not clean up after the tree has been built.
       
       -A
        Laplace smoothing for predicted probabilities.
       
       -Q <seed>
        Seed for random data shuffling (default 1).
       
      Specified by:
      setOptions in interface OptionHandler
      Overrides:
      setOptions in class RandomizableSingleClassifierEnhancer
      Parameters:
      options - the list of options as an array of strings
      Throws:
      Exception - if an option is not supported
    • doNotCheckForModifiedClassAttributeTipText

      public String doNotCheckForModifiedClassAttributeTipText()
      Returns the tip text for this property
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • getDoNotCheckForModifiedClassAttribute

      public boolean getDoNotCheckForModifiedClassAttribute()
      Returns true if the classifier does not check whether the class attribute has been modified by the filter.
    • setDoNotCheckForModifiedClassAttribute

      public void setDoNotCheckForModifiedClassAttribute(boolean flag)
      Sets whether the classifier skips the check that the class attribute has not been modified by the filter.
    • getOptions

      public String[] getOptions()
      Gets the current settings of the Classifier.
      Specified by:
      getOptions in interface OptionHandler
      Overrides:
      getOptions in class RandomizableSingleClassifierEnhancer
      Returns:
      an array of strings suitable for passing to setOptions
    • filterTipText

      public String filterTipText()
      Returns the tip text for this property
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • setFilter

      public void setFilter(Filter filter)
      Sets the filter
      Parameters:
      filter - the filter with all options set.
    • getFilter

      public Filter getFilter()
      Gets the filter used.
      Returns:
      the filter
    • getCapabilities

      public Capabilities getCapabilities()
      Returns default capabilities of the classifier.
      Specified by:
      getCapabilities in interface CapabilitiesHandler
      Specified by:
      getCapabilities in interface Classifier
      Overrides:
      getCapabilities in class SingleClassifierEnhancer
      Returns:
      the capabilities of this classifier
    • buildClassifier

      public void buildClassifier(Instances data) throws Exception
      Build the classifier on the filtered data.
      Specified by:
      buildClassifier in interface Classifier
      Parameters:
      data - the training data
      Throws:
      Exception - if the classifier could not be built successfully
    • distributionForInstance

      public double[] distributionForInstance(Instance instance) throws Exception
      Classifies a given instance after filtering.
      Specified by:
      distributionForInstance in interface Classifier
      Overrides:
      distributionForInstance in class AbstractClassifier
      Parameters:
      instance - the instance to be classified
      Returns:
      the class distribution for the given instance
      Throws:
      Exception - if instance could not be classified successfully
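
Taken together, `buildClassifier` and `distributionForInstance` imply that the filter is set up once from the training data and then applied unchanged to each instance at prediction time. A minimal sketch of that idea follows, using a hypothetical equal-width binning filter on plain doubles instead of a Weka `Filter`; the names `TrainOnlyFilterSketch` and `BinningFilter` are assumptions made for illustration.

```java
import java.util.Arrays;

final class TrainOnlyFilterSketch {

    /** Stand-in filter: equal-width binning whose range is learned from training data. */
    static final class BinningFilter {
        private final int bins;
        private double min;
        private double max;
        private boolean fitted;

        BinningFilter(int bins) {
            this.bins = bins;
        }

        /** Learns the value range from the training values only. */
        void fit(double[] train) {
            min = Arrays.stream(train).min()
                    .orElseThrow(() -> new IllegalArgumentException("empty training data"));
            max = Arrays.stream(train).max().getAsDouble();
            fitted = true;
        }

        /** Maps a value to a bin using the learned range; out-of-range values are clamped. */
        int apply(double value) {
            if (!fitted) {
                throw new IllegalStateException("fit must be called before apply");
            }
            if (max == min) {
                return 0;
            }
            int bin = (int) Math.floor((value - min) / (max - min) * bins);
            return Math.max(0, Math.min(bins - 1, bin));
        }
    }
}
```

A test value outside the training range, such as 25 when the training data spans 0 to 10, lands in the last bin; the bin boundaries themselves never move after `fit`.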
    • batchSizeTipText

      public String batchSizeTipText()
      Tool tip text for this property
      Overrides:
      batchSizeTipText in class AbstractClassifier
      Returns:
      the tool tip for this property
    • setBatchSize

      public void setBatchSize(String size)
      Set the batch size to use. Gets passed through to the base learner if it implements BatchPredictor. Otherwise it is just ignored.
      Specified by:
      setBatchSize in interface BatchPredictor
      Overrides:
      setBatchSize in class AbstractClassifier
      Parameters:
      size - the batch size to use
    • getBatchSize

      public String getBatchSize()
      Gets the preferred batch size from the base learner if it implements BatchPredictor. Returns 1 as the preferred batch size otherwise.
      Specified by:
      getBatchSize in interface BatchPredictor
      Overrides:
      getBatchSize in class AbstractClassifier
      Returns:
      the batch size to use
    • distributionsForInstances

      public double[][] distributionsForInstances(Instances insts) throws Exception
      Batch scoring method. Calls the appropriate method for the base learner if it implements BatchPredictor. Otherwise it simply calls the distributionForInstance() method repeatedly.
      Specified by:
      distributionsForInstances in interface BatchPredictor
      Overrides:
      distributionsForInstances in class AbstractClassifier
      Parameters:
      insts - the instances to get predictions for
      Returns:
      an array of probability distributions, one for each instance
      Throws:
      Exception - if a problem occurs
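
The dispatch described for `distributionsForInstances` (use the base learner's batch method when it has one, otherwise score instance by instance) can be sketched with two stand-in interfaces. `SingleScorer`, `BatchScorer` and `score` are hypothetical names that operate on plain arrays; they only illustrate the fallback and are not Weka types.

```java
final class BatchFallbackSketch {

    /** Stand-in for a classifier that scores one instance at a time. */
    interface SingleScorer {
        double[] distributionForInstance(double[] instance);
    }

    /** Stand-in for a classifier that also offers its own batch method. */
    interface BatchScorer extends SingleScorer {
        double[][] distributionsForInstances(double[][] instances);
    }

    /** Uses the batch method when available, otherwise scores one instance at a time. */
    static double[][] score(SingleScorer base, double[][] instances) {
        if (base instanceof BatchScorer) {
            return ((BatchScorer) base).distributionsForInstances(instances);
        }
        double[][] result = new double[instances.length][];
        for (int i = 0; i < instances.length; i++) {
            result[i] = base.distributionForInstance(instances[i]);
        }
        return result;
    }
}
```

Either path returns one distribution per input instance, so callers do not need to know which branch was taken.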
    • implementsMoreEfficientBatchPrediction

      public boolean implementsMoreEfficientBatchPrediction()
      Returns true if the base classifier implements BatchPredictor and is able to generate batch predictions efficiently
      Specified by:
      implementsMoreEfficientBatchPrediction in interface BatchPredictor
      Overrides:
      implementsMoreEfficientBatchPrediction in class AbstractClassifier
      Returns:
      true if the base classifier can generate batch predictions efficiently
    • toString

      public String toString()
      Output a representation of this classifier
      Overrides:
      toString in class Object
      Returns:
      a representation of this classifier
    • getRevision

      public String getRevision()
      Returns the revision string.
      Specified by:
      getRevision in interface RevisionHandler
      Overrides:
      getRevision in class AbstractClassifier
      Returns:
      the revision
    • main

      public static void main(String[] argv)
      Main method for testing this class.
      Parameters:
      argv - should contain the following arguments: -t training file [-T test file] [-c class index]