Class SubstringReplacer

java.lang.Object
weka.knowledgeflow.steps.BaseStep
weka.knowledgeflow.steps.SubstringReplacer
All Implemented Interfaces:
Serializable, BaseStepExtender, Step

@KFStep(name="SubstringReplacer", category="Tools", toolTipText="Replace substrings in String attribute values using either literal match-and-replace or regular expression matching. The attributes to apply the match and replace rules to can be selected via a range string (e.g. 1-5,6-last) or by a comma-separated list of attribute names (/first and /last can be used to indicate the first and last attribute respectively)", iconPath="weka/gui/knowledgeflow/icons/DefaultFilter.gif")
public class SubstringReplacer extends BaseStep
A step that replaces substrings in the values of String attributes. It operates in streaming mode only. Multiple match-and-replace "rules" can be specified; they are applied in the order in which they are defined. Each rule can be applied to one or more user-specified input String attributes. Attributes can be specified using either a range list (e.g. 1,2-10,last) or a comma-separated list of attribute names (where "/first" and "/last" are special strings indicating the first and last attribute respectively).
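The range-list form of attribute selection described above can be illustrated with a minimal, self-contained sketch. This is not Weka's own range handling: the class name `RangeSketch`, its methods, and the decision to return zero-based indices are assumptions made for illustration, and the attribute-name form (with "/first" and "/last") is not covered.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of Weka: expands a 1-based range list
// such as "1,2-10,last" into zero-based attribute indices.
public class RangeSketch {

  // Resolve a single token ("3", "first" or "last") to a zero-based index.
  public static int index(String token, int numAttributes) {
    String t = token.trim().toLowerCase(Locale.ROOT);
    if (t.equals("first")) {
      return 0;
    }
    if (t.equals("last")) {
      return numAttributes - 1;
    }
    int i = Integer.parseInt(t) - 1; // range lists are 1-based
    if (i < 0 || i >= numAttributes) {
      throw new IllegalArgumentException("Index out of range: " + token);
    }
    return i;
  }

  // Expand a comma-separated list of single indices and "lo-hi" spans.
  public static List<Integer> parse(String range, int numAttributes) {
    List<Integer> selected = new ArrayList<>();
    for (String part : range.split(",")) {
      String[] ends = part.split("-");
      if (ends.length > 2) {
        throw new IllegalArgumentException("Malformed span: " + part);
      }
      int lo = index(ends[0], numAttributes);
      int hi = ends.length == 2 ? index(ends[1], numAttributes) : lo;
      for (int i = lo; i <= hi; i++) {
        if (!selected.contains(i)) {
          selected.add(i); // keep first occurrence, skip duplicates
        }
      }
    }
    return selected;
  }

  public static void main(String[] args) {
    // With 12 attributes, "last" resolves to zero-based index 11.
    System.out.println(parse("1,2-10,last", 12));
    // prints [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 11]
  }
}
```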
Version:
$Revision: $
Author:
Mark Hall (mhall{[at]}pentaho{[dot]}com)
  • Constructor Details

    • SubstringReplacer

      public SubstringReplacer()
  • Method Details

    • setMatchReplaceDetails

      @ProgrammaticProperty public void setMatchReplaceDetails(String details)
      Set the internally encoded list of match-replace rules
      Parameters:
      details - the list of match-replace rules
    • getMatchReplaceDetails

      public String getMatchReplaceDetails()
      Get the internally encoded list of match-replace rules
      Returns:
      the match-replace rules
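Because the rules are applied in the order in which they are defined, reordering them can change the result. The following sketch illustrates that behaviour, along with the difference between literal and regular-expression matching, using only `java.util.regex`. The `Rule` class and `applyAll` method are hypothetical names introduced for illustration; the sketch does not model the internal encoding handled by `setMatchReplaceDetails`, nor the step's actual implementation.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Illustration only: ordered application of match-and-replace rules.
public class RuleOrderSketch {

  // Hypothetical rule holder; not a Weka class.
  public static final class Rule {
    final String match;
    final String replace;
    final boolean regex;

    public Rule(String match, String replace, boolean regex) {
      this.match = match;
      this.replace = replace;
      this.regex = regex;
    }

    String apply(String value) {
      if (regex) {
        // Regex mode: 'match' is compiled as a java.util.regex pattern.
        return Pattern.compile(match).matcher(value).replaceAll(replace);
      }
      // Literal mode: String.replace treats both arguments as plain text.
      return value.replace(match, replace);
    }
  }

  // Apply each rule to the output of the previous one, in list order.
  public static String applyAll(List<Rule> rules, String value) {
    String result = value;
    for (Rule rule : rules) {
      result = rule.apply(result);
    }
    return result;
  }

  public static void main(String[] args) {
    Rule catToDog = new Rule("cat", "dog", false); // literal rule
    Rule dogToBird = new Rule("d.g", "bird", true); // regex rule

    // "cat" -> "dog" -> "bird"
    System.out.println(applyAll(Arrays.asList(catToDog, dogToBird), "cat"));
    // The regex rule finds nothing in "cat", then the literal rule gives "dog"
    System.out.println(applyAll(Arrays.asList(dogToBird, catToDog), "cat"));
  }
}
```

Running `main` prints `bird` and then `dog`, showing that the same two rules produce different values depending on their order.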
    • stepInit

      public void stepInit() throws WekaException
      Initialize the step
      Throws:
      WekaException - if a problem occurs
    • getIncomingConnectionTypes

      public List<String> getIncomingConnectionTypes()
      Get a list of incoming connection types that this step can accept. Ideally (and if appropriate), this should take into account the state of the step and any existing incoming connections. E.g. a step might be able to accept one (and only one) incoming batch data connection.
      Returns:
      a list of incoming connections that this step can accept given its current state
    • getOutgoingConnectionTypes

      public List<String> getOutgoingConnectionTypes()
      Get a list of outgoing connection types that this step can produce. Ideally (and if appropriate), this should take into account the state of the step and the incoming connections. E.g. depending on what incoming connection is present, a step might be able to produce a trainingSet output, a testSet output or neither, but not both.
      Returns:
      a list of outgoing connections that this step can produce
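The idea that both connection lists depend on the step's current state can be sketched without any Weka classes. Everything below is hypothetical: the class `ConnectionTypesSketch`, the `connectIncoming` method, and the connection name "instance" are assumptions chosen to mirror a streaming-only step that accepts a single incoming connection. It is not the code of `SubstringReplacer`.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a streaming-only step; not a Weka class.
public class ConnectionTypesSketch {

  // Assumed name for a streaming (instance-at-a-time) connection.
  public static final String CON_INSTANCE = "instance";

  private int numIncoming = 0;

  // Simulate the framework attaching an incoming connection.
  public void connectIncoming() {
    numIncoming++;
  }

  // Accept one (and only one) incoming streaming connection.
  public List<String> getIncomingConnectionTypes() {
    if (numIncoming == 0) {
      return Collections.singletonList(CON_INSTANCE);
    }
    return Collections.emptyList();
  }

  // Offer streaming output only once there is something to process.
  public List<String> getOutgoingConnectionTypes() {
    if (numIncoming > 0) {
      return Collections.singletonList(CON_INSTANCE);
    }
    return Collections.emptyList();
  }

  public static void main(String[] args) {
    ConnectionTypesSketch step = new ConnectionTypesSketch();
    System.out.println(step.getIncomingConnectionTypes()); // [instance]
    System.out.println(step.getOutgoingConnectionTypes()); // []
    step.connectIncoming();
    System.out.println(step.getIncomingConnectionTypes()); // []
    System.out.println(step.getOutgoingConnectionTypes()); // [instance]
  }
}
```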
    • processIncoming

      public void processIncoming(Data data) throws WekaException
      Process an incoming data payload (if the step accepts incoming connections)
      Specified by:
      processIncoming in interface BaseStepExtender
      Specified by:
      processIncoming in interface Step
      Overrides:
      processIncoming in class BaseStep
      Parameters:
      data - the data to process
      Throws:
      WekaException - if a problem occurs
    • outputStructureForConnectionType

      public Instances outputStructureForConnectionType(String connectionName) throws WekaException
      If possible, get the output structure for the named connection type as a header-only set of instances. Can return null if the specified connection type is not representable as Instances or cannot be determined at present.
      Specified by:
      outputStructureForConnectionType in interface Step
      Overrides:
      outputStructureForConnectionType in class BaseStep
      Parameters:
      connectionName - the name of the connection type to get the output structure for
      Returns:
      the output structure as a header-only Instances object
      Throws:
      WekaException - if a problem occurs
    • getCustomEditorForStep

      public String getCustomEditorForStep()
      Return the fully qualified name of a custom editor component (JComponent) to use for editing the properties of the step. This method can return null, in which case the system will dynamically generate an editor using the GenericObjectEditor.
      Specified by:
      getCustomEditorForStep in interface Step
      Overrides:
      getCustomEditorForStep in class BaseStep
      Returns:
      the fully qualified name of a step editor component