Class DataGenerator

java.lang.Object
weka.datagenerators.DataGenerator
All Implemented Interfaces:
Serializable, OptionHandler, Randomizable, RevisionHandler
Direct Known Subclasses:
ClassificationGenerator, ClusterGenerator, RegressionGenerator

public abstract class DataGenerator extends Object implements OptionHandler, Randomizable, Serializable, RevisionHandler
Abstract superclass for data generators that generate data for classifiers and clusterers.
Version:
$Revision: 15437 $
Author:
FracPete (fracpete at waikato dot ac dot nz)
  • Constructor Details

    • DataGenerator

      public DataGenerator()
      Initializes the generator with default settings.
      Note: default values are set via a default<name> method. These default methods are also used in the listOptions and setOptions methods. The reason is that derived generators can override the return value of a default method to supply a value that is valid for them, which avoids exceptions.
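      The default<name> pattern described in the note can be sketched as follows. This is a minimal illustration, not the Weka implementation: the classes SketchGenerator and SketchSubGenerator, the defaultSeed method, and the "-S" flag are hypothetical stand-ins chosen for the example.

```java
// Hypothetical stand-ins for illustration; these are not the Weka classes.
abstract class SketchGenerator {
    private int seed;

    SketchGenerator() {
        // The constructor takes its initial value from the default method.
        seed = defaultSeed();
    }

    // Subclasses override this to supply a default that is valid for them.
    protected int defaultSeed() {
        return 1;
    }

    // Option parsing falls back to the same default when the flag is absent.
    void setOptions(String[] options) {
        seed = defaultSeed();
        for (int i = 0; i + 1 < options.length; i++) {
            if (options[i].equals("-S")) {   // "-S" is an assumed flag name
                seed = Integer.parseInt(options[i + 1]);
            }
        }
    }

    int getSeed() {
        return seed;
    }
}

class SketchSubGenerator extends SketchGenerator {
    @Override
    protected int defaultSeed() {
        return 42;
    }
}

public class DefaultMethodSketch {
    static int seedAfterConstruction() {
        return new SketchSubGenerator().getSeed();
    }

    static int seedAfterOptions(String... options) {
        SketchGenerator g = new SketchSubGenerator();
        g.setOptions(options);
        return g.getSeed();
    }

    public static void main(String[] args) {
        System.out.println(seedAfterConstruction());      // subclass default
        System.out.println(seedAfterOptions("-S", "7"));  // explicit option wins
        System.out.println(seedAfterOptions());           // falls back to subclass default
    }
}
```

      Because the constructor and the option parser both call the same overridable method, a subclass only has to change its default in one place.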
  • Method Details

    • listOptions

      public Enumeration<Option> listOptions()
      Returns an enumeration describing the available options.
      Specified by:
      listOptions in interface OptionHandler
      Returns:
      an enumeration of all the available options
    • enumToVector

      public Vector<Option> enumToVector(Enumeration<Option> enu)
      Convenience method. Turns the given enumeration of options into a vector.
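      The conversion can be sketched with plain java.util types. This is an assumed equivalent written for illustration (the class name EnumToVectorSketch and the generic helper toVector are not part of Weka); it shows only the general idea of draining an Enumeration into a Vector.

```java
import java.util.Arrays;
import java.util.Enumeration;
import java.util.Vector;

public class EnumToVectorSketch {
    // Copies every element of the enumeration into a new Vector, preserving order.
    static <T> Vector<T> toVector(Enumeration<T> enu) {
        Vector<T> result = new Vector<>();
        while (enu.hasMoreElements()) {
            result.add(enu.nextElement());
        }
        return result;
    }

    public static void main(String[] args) {
        Vector<String> source = new Vector<>(Arrays.asList("first", "second", "third"));
        Vector<String> copy = toVector(source.elements());
        System.out.println(copy);
    }
}
```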
    • setOptions

      public void setOptions(String[] options) throws Exception
      Parses a list of options for this object.

      For a list of valid options, see the class description.

      Specified by:
      setOptions in interface OptionHandler
      Parameters:
      options - the list of options as an array of strings
      Throws:
      Exception - if an option is not supported
    • getOptions

      public String[] getOptions()
      Gets the current settings of the data generator. Blacklisted options must be removed in the derived class that defines the blacklist entry.
      Specified by:
      getOptions in interface OptionHandler
      Returns:
      an array of strings suitable for passing to setOptions
      See Also:
      • removeBlacklist(String[])
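      The contract that getOptions returns an array suitable for passing to setOptions can be illustrated with a round trip. The sketch below is a self-contained stand-in, not Weka code: the class OptionRoundTripSketch and the flags "-S" and "-D" are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class OptionRoundTripSketch {
    private int seed = 1;
    private boolean debug = false;

    // Reads "-S <int>" and "-D" (assumed flag names); absent options revert to defaults.
    void setOptions(String[] options) {
        seed = 1;
        debug = false;
        for (int i = 0; i < options.length; i++) {
            if (options[i].equals("-S") && i + 1 < options.length) {
                seed = Integer.parseInt(options[++i]);
            } else if (options[i].equals("-D")) {
                debug = true;
            }
        }
    }

    // Emits the current settings in a form that setOptions can parse again.
    String[] getOptions() {
        List<String> result = new ArrayList<>();
        if (debug) {
            result.add("-D");
        }
        result.add("-S");
        result.add(Integer.toString(seed));
        return result.toArray(new String[0]);
    }

    // Configures one object, copies its options to a second, and returns the second's options.
    static String[] roundTrip(String... options) {
        OptionRoundTripSketch first = new OptionRoundTripSketch();
        first.setOptions(options);
        OptionRoundTripSketch second = new OptionRoundTripSketch();
        second.setOptions(first.getOptions());
        return second.getOptions();
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(roundTrip("-S", "5", "-D")));
    }
}
```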
    • defineDataFormat

      public Instances defineDataFormat() throws Exception
      Constructs the Instances object representing the format of the generated data. This default implementation simply returns the Instances object that holds the dataset format currently stored in m_DatasetFormat.
      Returns:
      the format for the dataset
      Throws:
      Exception - if the generating of the format failed
      See Also:
      • defaultRelationName()
    • generateExample

      public abstract Instance generateExample() throws Exception
      Generates one example of the dataset.
      Returns:
      the generated example
      Throws:
      Exception - if the format of the dataset is not yet defined
      Exception - if the generator only works with generateExamples, i.e. it is not in single mode
    • generateExamples

      public abstract Instances generateExamples() throws Exception
      Generates all examples of the dataset.
      Returns:
      the generated dataset
      Throws:
      Exception - if the format of the dataset is not yet defined
      Exception - if the generator only works with generateExample, i.e. it is in single mode
    • generateStart

      public abstract String generateStart() throws Exception
      Generates a comment string that documents the data generator. By default, this string is added at the beginning of the produced ARFF output, directly after the options.
      Returns:
      a string containing information about the generated rules
      Throws:
      Exception - if the generating of the documentation fails
    • generateFinished

      public abstract String generateFinished() throws Exception
      Generates a comment string that documents the data generator. By default, this string is added at the end of the produced ARFF output.
      Returns:
      a string containing information about the generated rules
      Throws:
      Exception - if the generating of the documentation fails
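      In ARFF, a comment line begins with the % character. The sketch below shows one way documentation strings could be placed before and after the data as comments. It is an illustration under assumptions: the class ArffCommentSketch and its helpers are hypothetical, and whether the strings returned by generateStart and generateFinished already carry the comment marker is not stated here.

```java
public class ArffCommentSketch {
    // Prefixes every line of the text with the ARFF comment marker '%'.
    static String toComment(String text) {
        StringBuilder sb = new StringBuilder();
        for (String line : text.split("\n")) {
            sb.append("% ").append(line).append('\n');
        }
        return sb.toString();
    }

    // Places the start documentation before the body and the finish documentation after it.
    static String wrap(String start, String body, String finished) {
        return toComment(start) + body + toComment(finished);
    }

    public static void main(String[] args) {
        String body = "@relation demo\n@attribute x numeric\n@data\n1.0\n";
        System.out.print(wrap("generator settings", body, "generation finished"));
    }
}
```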
    • getSingleModeFlag

      public abstract boolean getSingleModeFlag() throws Exception
      Returns whether single mode is set for the given data generator. The mode depends on the option settings and/or the generator type.
      Returns:
      single mode flag
      Throws:
      Exception - if mode is not set yet
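      A caller can use the single mode flag to decide between generateExample and generateExamples. The following sketch models that decision with a hypothetical RowGenerator interface that mirrors the method names above but produces plain strings; it is not the Weka API and does not claim to reproduce how makeData works.

```java
import java.util.ArrayList;
import java.util.List;

public class SingleModeSketch {
    // Hypothetical stand-in for a generator; rows are plain strings here.
    interface RowGenerator {
        boolean getSingleModeFlag();
        int getNumExamplesAct();
        String generateExample();
        List<String> generateExamples();
    }

    // Collects all rows, one at a time in single mode, otherwise in one batch.
    static List<String> collect(RowGenerator g) {
        List<String> rows = new ArrayList<>();
        if (g.getSingleModeFlag()) {
            for (int i = 0; i < g.getNumExamplesAct(); i++) {
                rows.add(g.generateExample());
            }
        } else {
            rows.addAll(g.generateExamples());
        }
        return rows;
    }

    // Builds a toy generator that supports exactly one of the two modes.
    static RowGenerator counter(final boolean single, final int n) {
        return new RowGenerator() {
            private int next = 0;

            public boolean getSingleModeFlag() { return single; }

            public int getNumExamplesAct() { return n; }

            public String generateExample() {
                if (!single) throw new IllegalStateException("batch mode only");
                return "row" + next++;
            }

            public List<String> generateExamples() {
                if (single) throw new IllegalStateException("single mode only");
                List<String> all = new ArrayList<>();
                for (int i = 0; i < n; i++) {
                    all.add("row" + i);
                }
                return all;
            }
        };
    }

    public static void main(String[] args) {
        System.out.println(collect(counter(true, 3)));
        System.out.println(collect(counter(false, 3)));
    }
}
```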
    • setDebug

      public void setDebug(boolean debug)
      Sets the debug flag.
      Parameters:
      debug - the new debug flag
    • getDebug

      public boolean getDebug()
      Gets the debug flag.
      Returns:
      the debug flag
    • debugTipText

      public String debugTipText()
      Returns the tip text for this property.
      Returns:
      the tip text for this property, suitable for display in the Explorer/Experimenter GUI
    • setRelationName

      public void setRelationName(String relationName)
      Sets the relation name the dataset should have.
      Parameters:
      relationName - the new relation name
    • getRelationName

      public String getRelationName()
      Gets the relation name the dataset should have.
      Returns:
      the relation name the dataset should have
    • relationNameTipText

      public String relationNameTipText()
      Returns the tip text for this property.
      Returns:
      the tip text for this property, suitable for display in the Explorer/Experimenter GUI
    • getNumExamplesAct

      public int getNumExamplesAct()
      Gets the number of examples the dataset should have.
      Returns:
      the number of examples the dataset should have
    • setOutput

      public void setOutput(PrintWriter newOutput)
      Sets the print writer.
      Parameters:
      newOutput - the new print writer
    • getOutput

      public PrintWriter getOutput()
      Gets the print writer.
      Returns:
      print writer object
    • defaultOutput

      public PrintWriter defaultOutput()
      Gets the writer used for output to stdout. This is a workaround for the problem that stdout is closed when the associated PrintWriter is closed.
      Returns:
      writer object
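      The underlying problem is that closing a PrintWriter also closes the stream it wraps. One possible way to avoid this, shown here as a sketch and not as the approach Weka takes, is to wrap the target stream so that close() only flushes. The names NonClosingWriterSketch, NonClosingStream, and TrackingStream are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.PrintWriter;

public class NonClosingWriterSketch {
    // Forwards writes but turns close() into a flush so the target stays open.
    static class NonClosingStream extends FilterOutputStream {
        NonClosingStream(OutputStream out) {
            super(out);
        }

        @Override
        public void write(byte[] b, int off, int len) throws IOException {
            out.write(b, off, len); // avoid the byte-by-byte default of FilterOutputStream
        }

        @Override
        public void close() throws IOException {
            flush();
        }
    }

    static PrintWriter writerFor(OutputStream target) {
        return new PrintWriter(new NonClosingStream(target));
    }

    // Records whether close() reached it; used only to demonstrate the effect.
    static class TrackingStream extends ByteArrayOutputStream {
        boolean closed = false;

        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    static boolean targetClosedAfterWriterClose() {
        TrackingStream target = new TrackingStream();
        PrintWriter writer = writerFor(target);
        writer.println("x");
        writer.close();
        return target.closed;
    }

    public static void main(String[] args) {
        PrintWriter out = writerFor(System.out);
        out.println("written through the wrapper");
        out.close();
        System.out.println("stdout is still usable");
    }
}
```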
    • outputTipText

      public String outputTipText()
      Returns the tip text for this property.
      Returns:
      the tip text for this property, suitable for display in the Explorer/Experimenter GUI
    • setDatasetFormat

      public void setDatasetFormat(Instances newFormat)
      Sets the format of the dataset that is to be generated.
      Parameters:
      newFormat - the new format of the dataset
    • getDatasetFormat

      public Instances getDatasetFormat()
      Gets the format of the dataset that is to be generated.
      Returns:
      the format of the dataset
    • formatTipText

      public String formatTipText()
      Returns the tip text for this property.
      Returns:
      the tip text for this property, suitable for display in the Explorer/Experimenter GUI
    • getSeed

      public int getSeed()
      Gets the random number seed.
      Specified by:
      getSeed in interface Randomizable
      Returns:
      the random number seed.
    • setSeed

      public void setSeed(int newSeed)
      Sets the random number seed.
      Specified by:
      setSeed in interface Randomizable
      Parameters:
      newSeed - the new random number seed.
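      A fixed seed makes generated data reproducible. The sketch below demonstrates this with java.util.Random alone; it assumes the seed is used to construct a Random, which is not stated above, and the class SeedSketch and method draw are hypothetical.

```java
import java.util.Arrays;
import java.util.Random;

public class SeedSketch {
    // Draws 'count' pseudo-random integers in [0, 100) from a generator built with the given seed.
    static int[] draw(int seed, int count) {
        Random random = new Random(seed);
        int[] values = new int[count];
        for (int i = 0; i < count; i++) {
            values[i] = random.nextInt(100);
        }
        return values;
    }

    public static void main(String[] args) {
        // Two runs with the same seed produce the same sequence.
        System.out.println(Arrays.equals(draw(1, 5), draw(1, 5)));
    }
}
```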
    • seedTipText

      public String seedTipText()
      Returns the tip text for this property.
      Returns:
      the tip text for this property, suitable for display in the Explorer/Experimenter GUI
    • getRandom

      public Random getRandom()
      Gets the random generator.
      Returns:
      the random generator
    • setRandom

      public void setRandom(Random newRandom)
      Sets the random generator.
      Parameters:
      newRandom - the new random generator
    • randomTipText

      public String randomTipText()
      Returns the tip text for this property.
      Returns:
      the tip text for this property, suitable for display in the Explorer/Experimenter GUI
    • getPrologue

      public String getPrologue() throws Exception
      Gets the prologue string.
      Returns:
      prologue
      Throws:
      Exception
    • getEpilogue

      public String getEpilogue() throws Exception
      Gets the epilogue string.
      Returns:
      epilogue
      Throws:
      Exception
    • makeData

      public static void makeData(DataGenerator generator, String[] options) throws Exception
      Calls the data generator.
      Parameters:
      generator - one of the data generators
      options - options of the data generator
      Throws:
      Exception - if there was an error in the option list
    • runDataGenerator

      public static void runDataGenerator(DataGenerator datagenerator, String[] options)
      Runs the data generator instance with the given options.
      Parameters:
      datagenerator - the datagenerator to run
      options - the commandline options