Class SubstringLabelerRules

java.lang.Object
weka.gui.beans.SubstringLabelerRules
All Implemented Interfaces:
Serializable, EnvironmentHandler

public class SubstringLabelerRules extends Object implements EnvironmentHandler, Serializable
Manages a list of match rules for labeling strings. Also has methods for determining the output structure with respect to a set of rules and for constructing output instances that have been labeled according to the rules.
Version:
$Revision: 12232 $
Author:
Mark Hall (mhall{[at]}pentaho{[dot]}com)
  • Field Details

    • MATCH_RULE_SEPARATOR

      public static final String MATCH_RULE_SEPARATOR
      Separator for match rules in the internal representation.
  • Constructor Details

    • SubstringLabelerRules

      public SubstringLabelerRules(String matchDetails, String newAttName, boolean consumeNonMatching, boolean nominalBinary, Instances inputStructure, String statusMessagePrefix, Logger log, Environment env) throws Exception
      Constructor
      Parameters:
      matchDetails - the internally encoded match details string
      newAttName - the name of the new attribute that will be the label
      consumeNonMatching - true if non-matching instances should be consumed
      nominalBinary - true if, in the case where no user labels have been supplied, the new attribute should be a nominal binary one rather than numeric
      inputStructure - the incoming instances structure
      statusMessagePrefix - an optional status message prefix string for logging
      log - the log to use (may be null)
      env - environment variables
      Throws:
      Exception
    • SubstringLabelerRules

      public SubstringLabelerRules(String matchDetails, String newAttName, Instances inputStructure) throws Exception
      Constructor. Sets consumeNonMatching and nominalBinary to false, uses the system-wide environment variables, and sets neither a status message prefix nor a log.
      Parameters:
      matchDetails - the internally encoded match details string.
      newAttName - the name of the new attribute that will be the label
      inputStructure - the incoming instances structure
      Throws:
      Exception
  • Method Details

    • setConsumeNonMatching

      public void setConsumeNonMatching(boolean n)
      Set whether to consume non-matching instances. If false, they are passed through unaltered.
      Parameters:
      n - if true, non-matching instances are consumed (so only matching, and thus labelled, instances are output)
    • getConsumeNonMatching

      public boolean getConsumeNonMatching()
      Get whether to consume non-matching instances. If false, they are passed through unaltered.
      Returns:
      true if non-matching instances are consumed (so only matching, and thus labelled, instances are output)
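      The difference between consuming and passing through non-matching instances can be illustrated with a minimal, dependency-free sketch. It is not the Weka implementation: ConsumeSketch, run, and the use of plain strings in place of Instance objects are all hypothetical, and a single substring stands in for a full list of match rules.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the labeler; not part of Weka. */
class ConsumeSketch {

    /**
     * Labels every input that contains the substring.
     * Non-matching inputs are dropped when consumeNonMatching is true,
     * and copied to the output unaltered when it is false.
     */
    static List<String> run(List<String> inputs, String substring,
                            String label, boolean consumeNonMatching) {
        List<String> output = new ArrayList<>();
        for (String text : inputs) {
            if (text.contains(substring)) {
                // Matching input: emit it together with its label.
                output.add(text + " => " + label);
            } else if (!consumeNonMatching) {
                // Non-matching input: pass it through unchanged.
                output.add(text);
            }
            // Otherwise the non-matching input is consumed (nothing is emitted).
        }
        return output;
    }
}
```

      With the inputs "error: disk full" and "all good", the substring "error" and the label "problem", run returns one element when consumeNonMatching is true and two elements when it is false.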
    • setNominalBinary

      public void setNominalBinary(boolean n)
      Set whether to create a nominal binary attribute when the user has not supplied an explicit label for each rule. If no labels are provided, the output attribute is a binary indicator (i.e. a rule matched or it didn't). This option codes that binary indicator as nominal rather than numeric.
      Parameters:
      n - true if a binary indicator attribute should be nominal rather than numeric
    • getNominalBinary

      public boolean getNominalBinary()
      Get whether to create a nominal binary attribute when the user has not supplied an explicit label for each rule. If no labels are provided, the output attribute is a binary indicator (i.e. a rule matched or it didn't). This option codes that binary indicator as nominal rather than numeric.
      Returns:
      true if a binary indicator attribute should be nominal rather than numeric
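      The two codings of the binary indicator can be sketched as follows. This is an illustration only, under stated assumptions: BinaryIndicatorSketch and its methods are hypothetical, and the category names "0" and "1" are placeholders, since the values Weka actually uses for the nominal attribute are not given here.

```java
/** Hypothetical illustration of numeric versus nominal indicator coding; not part of Weka. */
class BinaryIndicatorSketch {

    /** Numeric coding: 1.0 when some rule matched, 0.0 otherwise. */
    static double numericIndicator(boolean matched) {
        return matched ? 1.0 : 0.0;
    }

    /** Nominal coding: one of two category names (placeholder names, assumed). */
    static String nominalIndicator(boolean matched) {
        return matched ? "1" : "0";
    }

    /** Picks the coding according to the nominalBinary flag. */
    static Object indicator(boolean matched, boolean nominalBinary) {
        if (nominalBinary) {
            return nominalIndicator(matched);
        }
        return Double.valueOf(numericIndicator(matched));
    }
}
```

      The information carried is the same in both cases; only the attribute type differs, which matters for downstream schemes that treat numeric and nominal attributes differently.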
    • getOutputStructure

      public Instances getOutputStructure()
      Get the output structure
      Returns:
      the structure of the output instances
    • getInputStructure

      public Instances getInputStructure()
      Get the input structure
      Returns:
      the structure of the input instances
    • setNewAttributeName

      public void setNewAttributeName(String newName)
      Set the name to use for the new attribute that is added
      Parameters:
      newName - the name to use
    • getNewAttributeName

      public String getNewAttributeName()
      Get the name to use for the new attribute that is added
      Returns:
      the name to use
    • setEnvironment

      public void setEnvironment(Environment env)
      Description copied from interface: EnvironmentHandler
      Set environment variables to use.
      Specified by:
      setEnvironment in interface EnvironmentHandler
      Parameters:
      env - the environment variables to use
    • matchRulesFromInternal

      public static List<SubstringLabelerRules.SubstringLabelerMatchRule> matchRulesFromInternal(String matchDetails, Instances inputStructure, String statusMessagePrefix, Logger log, Environment env)
      Get a list of match rules from an internally encoded match specification
      Parameters:
      matchDetails - the internally encoded specification of the match rules
      inputStructure - the input instances structure
      statusMessagePrefix - an optional status message prefix for logging
      log - the log to use
      env - environment variables
      Returns:
      a list of match rules
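      As a rough sketch of the first step such a decoder might take, the encoded specification can be split into one string per rule on a separator. Everything here is an assumption: RuleSplitSketch and splitRules are hypothetical, HYPOTHETICAL_SEPARATOR is a placeholder rather than the real value of MATCH_RULE_SEPARATOR, and the layout of the fields inside each rule is not covered.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical splitter for an encoded rule list; not part of Weka. */
class RuleSplitSketch {

    /** Placeholder only; the real MATCH_RULE_SEPARATOR value is not given here. */
    static final String HYPOTHETICAL_SEPARATOR = "@@rule@@";

    /** Splits the encoded specification into one trimmed string per rule. */
    static List<String> splitRules(String matchDetails) {
        List<String> rules = new ArrayList<>();
        if (matchDetails == null || matchDetails.isEmpty()) {
            return rules;
        }
        // Pattern.quote makes the separator a literal, not a regular expression.
        String[] parts = matchDetails.split(Pattern.quote(HYPOTHETICAL_SEPARATOR));
        for (String part : parts) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                rules.add(trimmed);
            }
        }
        return rules;
    }
}
```

      In real code the constant SubstringLabelerRules.MATCH_RULE_SEPARATOR would take the place of the placeholder, and each resulting string would still need to be parsed into a SubstringLabelerMatchRule.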
    • makeOutputInstance

      public Instance makeOutputInstance(Instance inputI, boolean batch) throws Exception
      Process an input instance and return an output instance.
      Parameters:
      inputI - the incoming instance
      batch - whether this is being processed as part of a batch of instances
      Returns:
      the output instance
      Throws:
      Exception - if the output structure has not yet been determined