Class CfsSubsetEval

java.lang.Object
weka.attributeSelection.ASEvaluation
weka.attributeSelection.CfsSubsetEval
All Implemented Interfaces:
Serializable, SubsetEvaluator, CapabilitiesHandler, CapabilitiesIgnorer, CommandlineRunnable, OptionHandler, RevisionHandler, TechnicalInformationHandler, ThreadSafe

public class CfsSubsetEval extends ASEvaluation implements SubsetEvaluator, ThreadSafe, OptionHandler, TechnicalInformationHandler
CfsSubsetEval :

Evaluates the worth of a subset of attributes by considering the individual predictive ability of each feature along with the degree of redundancy between them.

Subsets of features that are highly correlated with the class while having low intercorrelation are preferred.
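
The preference described above can be made concrete with the merit heuristic from Hall (1998), which scores a subset of k features from the mean feature-class correlation and the mean feature-feature intercorrelation. The sketch below is a minimal illustration only, not taken from the Weka source. The class name `CfsMeritSketch` and the parameter names are hypothetical, and the actual implementation may differ in detail (for example, in how correlations are measured and aggregated).

```java
class CfsMeritSketch {

    /**
     * Merit of a subset of k features, following the heuristic in Hall (1998):
     *   merit = k * rcf / sqrt(k + k * (k - 1) * rff)
     * where rcf is the mean feature-class correlation and
     * rff is the mean feature-feature intercorrelation.
     */
    static double merit(int k, double meanFeatureClassCorr, double meanFeatureFeatureCorr) {
        if (k <= 0) {
            return 0.0; // an empty subset has no merit
        }
        double denom = Math.sqrt(k + k * (k - 1) * meanFeatureFeatureCorr);
        return denom == 0.0 ? 0.0 : k * meanFeatureClassCorr / denom;
    }

    public static void main(String[] args) {
        // Same predictive ability, different redundancy between the features.
        double lowRedundancy = merit(3, 0.5, 0.2);
        double highRedundancy = merit(3, 0.5, 0.8);
        System.out.println("low redundancy:  " + lowRedundancy);
        System.out.println("high redundancy: " + highRedundancy);
    }
}
```

With the class correlation held fixed, raising the intercorrelation enlarges the denominator, so the more redundant subset receives the lower merit.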

For more information see:

M. A. Hall (1998). Correlation-based Feature Subset Selection for Machine Learning. Hamilton, New Zealand.

BibTeX:

 @phdthesis{Hall1998,
    address = {Hamilton, New Zealand},
    author = {M. A. Hall},
    school = {University of Waikato},
    title = {Correlation-based Feature Subset Selection for Machine Learning},
    year = {1998}
 }
 

Valid options are:

 -M
  Treat missing values as a separate value.
 
 -L
  Don't include locally predictive attributes.
 
 -Z
  Precompute the full correlation matrix at the outset, rather than computing correlations lazily (as needed) during the search. Use this in conjunction with parallel processing to speed up a backward search.
 
 -P <int>
  The size of the thread pool, for example, the number of cores in the CPU. (default 1)
 
 -E <int>
  The number of threads to use, which should be >= size of thread pool. (default 1)
 
 -D
  Output debugging info.
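
The difference between lazy computation and the precomputation selected by -Z can be sketched with a small cache. This is an illustration under stated assumptions, not Weka's implementation: the class `CorrelationCacheSketch` and its members are hypothetical, and Pearson correlation on numeric columns stands in for whatever correlation measure the evaluator actually uses.

```java
import java.util.Arrays;

class CorrelationCacheSketch {
    private final double[][] columns; // columns[a] holds the values of attribute a
    private final double[][] cache;   // NaN marks an entry that is not computed yet
    int computations = 0;             // number of correlations actually computed

    CorrelationCacheSketch(double[][] columns) {
        this.columns = columns;
        this.cache = new double[columns.length][columns.length];
        for (double[] row : cache) {
            Arrays.fill(row, Double.NaN);
        }
    }

    /** Lazy lookup: compute the entry on first use, then reuse it. */
    double get(int i, int j) {
        if (Double.isNaN(cache[i][j])) {
            double r = pearson(columns[i], columns[j]);
            computations++;
            cache[i][j] = r; // the matrix is symmetric,
            cache[j][i] = r; // so fill both halves
        }
        return cache[i][j];
    }

    /** Eager fill: visit every pair i < j once, before any search starts. */
    void precomputeAll() {
        for (int i = 0; i < columns.length; i++) {
            for (int j = i + 1; j < columns.length; j++) {
                get(i, j);
            }
        }
    }

    /** Pearson correlation; returns 0 when either column is constant. */
    static double pearson(double[] x, double[] y) {
        int n = x.length;
        double mx = 0, my = 0;
        for (int t = 0; t < n; t++) { mx += x[t]; my += y[t]; }
        mx /= n;
        my /= n;
        double sxy = 0, sxx = 0, syy = 0;
        for (int t = 0; t < n; t++) {
            double dx = x[t] - mx, dy = y[t] - my;
            sxy += dx * dy; sxx += dx * dx; syy += dy * dy;
        }
        double denom = Math.sqrt(sxx * syy);
        return denom == 0.0 ? 0.0 : sxy / denom;
    }

    public static void main(String[] args) {
        double[][] data = { {1, 2, 3, 4}, {2, 4, 6, 8}, {4, 3, 2, 1} };
        CorrelationCacheSketch c = new CorrelationCacheSketch(data);
        c.get(0, 1);       // lazy: only this pair is computed
        System.out.println("after one lookup: " + c.computations);
        c.precomputeAll(); // eager: the remaining pairs are filled in
        System.out.println("after precompute: " + c.computations);
    }
}
```

In the eager fill every entry is independent of the others, which is the kind of work that can plausibly be divided among the threads configured with -P and -E.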
 
Version:
$Revision: 15519 $
Author:
Mark Hall (mhall@cs.waikato.ac.nz)
  • Constructor Details

    • CfsSubsetEval

      public CfsSubsetEval()
      Constructor
  • Method Details

    • globalInfo

      public String globalInfo()
      Returns a string describing this attribute evaluator
      Returns:
      a description of the evaluator suitable for displaying in the explorer/experimenter gui
    • getTechnicalInformation

      public TechnicalInformation getTechnicalInformation()
      Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
      Specified by:
      getTechnicalInformation in interface TechnicalInformationHandler
      Returns:
      the technical information about this class
    • listOptions

      public Enumeration<Option> listOptions()
      Returns an enumeration describing the available options.
      Specified by:
      listOptions in interface OptionHandler
      Overrides:
      listOptions in class ASEvaluation
      Returns:
      an enumeration of all the available options.
    • setOptions

      public void setOptions(String[] options) throws Exception
      Parses and sets a given list of options.

      Valid options are:

       -M
        Treat missing values as a separate value.
       
       -L
        Don't include locally predictive attributes.
       
       -Z
        Precompute the full correlation matrix at the outset, rather than computing correlations lazily (as needed) during the search. Use this in conjunction with parallel processing to speed up a backward search.
       
       -P <int>
        The size of the thread pool, for example, the number of cores in the CPU. (default 1)
       
       -E <int>
        The number of threads to use, which should be >= size of thread pool. (default 1)
       
       -D
        Output debugging info.
       
      Specified by:
      setOptions in interface OptionHandler
      Overrides:
      setOptions in class ASEvaluation
      Parameters:
      options - the list of options as an array of strings
      Throws:
      Exception - if an option is not supported
    • preComputeCorrelationMatrixTipText

      public String preComputeCorrelationMatrixTipText()
      Returns:
      a string to describe the option
    • setPreComputeCorrelationMatrix

      public void setPreComputeCorrelationMatrix(boolean p)
      Set whether to pre-compute the full correlation matrix at the outset, rather than computing individual correlations lazily (as needed) during the search.
      Parameters:
      p - true if the correlation matrix is to be pre-computed at the outset
    • getPreComputeCorrelationMatrix

      public boolean getPreComputeCorrelationMatrix()
      Get whether to pre-compute the full correlation matrix at the outset, rather than computing individual correlations lazily (as needed) during the search.
      Returns:
      true if the correlation matrix is to be pre-computed at the outset
    • numThreadsTipText

      public String numThreadsTipText()
      Returns:
      a string to describe the option
    • getNumThreads

      public int getNumThreads()
      Gets the number of threads.
    • setNumThreads

      public void setNumThreads(int nT)
      Sets the number of threads.
    • poolSizeTipText

      public String poolSizeTipText()
      Returns:
      a string to describe the option
    • getPoolSize

      public int getPoolSize()
      Gets the size of the thread pool.
    • setPoolSize

      public void setPoolSize(int nT)
      Sets the size of the thread pool.
    • locallyPredictiveTipText

      public String locallyPredictiveTipText()
      Returns the tip text for this property
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • setLocallyPredictive

      public void setLocallyPredictive(boolean b)
      Sets whether to include locally predictive attributes.
      Parameters:
      b - true if locally predictive attributes are to be included
    • getLocallyPredictive

      public boolean getLocallyPredictive()
      Returns true if locally predictive attributes are included.
      Returns:
      true if locally predictive attributes are to be used
    • missingSeparateTipText

      public String missingSeparateTipText()
      Returns the tip text for this property
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • setMissingSeparate

      public void setMissingSeparate(boolean b)
      Sets whether to treat missing values as a separate value.
      Parameters:
      b - true if missing values are to be treated as a separate value
    • getMissingSeparate

      public boolean getMissingSeparate()
      Returns true if missing values are treated as a separate value.
      Returns:
      true if missing is to be treated as a separate value
    • setDebug

      public void setDebug(boolean d)
      Set whether to output debugging info
      Parameters:
      d - true if debugging info is to be output
    • getDebug

      public boolean getDebug()
      Get whether to output debugging info
      Returns:
      true if debugging info is to be output
    • debugTipText

      public String debugTipText()
      Returns the tip text for this property
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • getOptions

      public String[] getOptions()
      Gets the current settings of CfsSubsetEval
      Specified by:
      getOptions in interface OptionHandler
      Overrides:
      getOptions in class ASEvaluation
      Returns:
      an array of strings suitable for passing to setOptions()
    • getCapabilities

      public Capabilities getCapabilities()
      Returns the capabilities of this evaluator.
      Specified by:
      getCapabilities in interface CapabilitiesHandler
      Overrides:
      getCapabilities in class ASEvaluation
      Returns:
      the capabilities of this evaluator
    • buildEvaluator

      public void buildEvaluator(Instances data) throws Exception
      Generates an attribute evaluator. Has to initialize all fields of the evaluator that are not being set via options. CFS also discretises attributes (if necessary) and initializes the correlation matrix.
      Specified by:
      buildEvaluator in class ASEvaluation
      Parameters:
      data - set of instances serving as training data
      Throws:
      Exception - if the evaluator has not been generated successfully
    • evaluateSubset

      public double evaluateSubset(BitSet subset) throws Exception
      Evaluates a subset of attributes.
      Specified by:
      evaluateSubset in interface SubsetEvaluator
      Parameters:
      subset - a bitset representing the attribute subset to be evaluated
      Returns:
      the merit
      Throws:
      Exception - if the subset could not be evaluated
    • toString

      public String toString()
      Returns a string describing CFS.
      Overrides:
      toString in class Object
      Returns:
      the description as a string
    • postProcess

      public int[] postProcess(int[] attributeSet) throws Exception
      Calls locallyPredictive in order to include locally predictive attributes (if requested).
      Overrides:
      postProcess in class ASEvaluation
      Parameters:
      attributeSet - the set of attributes found by the search
      Returns:
      a possibly ranked list of postprocessed attributes
      Throws:
      Exception - if postprocessing fails for some reason
    • clean

      public void clean()
      Description copied from class: ASEvaluation
      Tells the evaluator that the attribute selection process is complete. It can then clean up data structures and references to training data as necessary in order to save memory.
      Overrides:
      clean in class ASEvaluation
    • getRevision

      public String getRevision()
      Returns the revision string.
      Specified by:
      getRevision in interface RevisionHandler
      Overrides:
      getRevision in class ASEvaluation
      Returns:
      the revision
    • main

      public static void main(String[] args)
      Main method for testing this class.
      Parameters:
      args - the options
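
evaluateSubset takes the candidate subset as a BitSet, while postProcess takes and returns an int[] of attributes. The sketch below shows one way to move between the two representations using only java.util. The class `SubsetEncodingSketch` and its helper methods are hypothetical, and the mapping of bit i to the attribute with 0-based index i is an assumption made for the illustration.

```java
import java.util.Arrays;
import java.util.BitSet;

class SubsetEncodingSketch {

    /** Builds a BitSet with one bit set per attribute index (assumed 0-based). */
    static BitSet toBitSet(int[] attributeIndices) {
        BitSet subset = new BitSet();
        for (int index : attributeIndices) {
            subset.set(index);
        }
        return subset;
    }

    /** Lists the indices of the set bits in ascending order. */
    static int[] toIndices(BitSet subset) {
        int[] indices = new int[subset.cardinality()];
        int n = 0;
        for (int i = subset.nextSetBit(0); i >= 0; i = subset.nextSetBit(i + 1)) {
            indices[n++] = i;
        }
        return indices;
    }

    public static void main(String[] args) {
        BitSet subset = toBitSet(new int[] {5, 0, 2});
        System.out.println(subset);                             // prints {0, 2, 5}
        System.out.println(Arrays.toString(toIndices(subset))); // prints [0, 2, 5]
    }
}
```

A BitSet has no notion of order or duplicates, so converting an int[] to a BitSet and back yields the distinct indices in ascending order.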