All Implemented Interfaces:
Serializable, OptionHandler, Randomizable, RevisionHandler, TechnicalInformationHandler

public class Agrawal extends ClassificationGenerator implements TechnicalInformationHandler
Generates a people database and is based on the paper by Agrawal et al.:
R. Agrawal, T. Imielinski, A. Swami (1993). Database Mining: A Performance Perspective. IEEE Transactions on Knowledge and Data Engineering. 5(6):914-925. URL http://www.almaden.ibm.com/software/quest/Publications/ByDate.html.

BibTeX:

 @article{Agrawal1993,
    author = {R. Agrawal and T. Imielinski and A. Swami},
    journal = {IEEE Transactions on Knowledge and Data Engineering},
    note = {Special issue on Learning and Discovery in Knowledge-Based Databases},
    number = {6},
    pages = {914-925},
    title = {Database Mining: A Performance Perspective},
    volume = {5},
    year = {1993},
    URL = {http://www.almaden.ibm.com/software/quest/Publications/ByDate.html},
    PDF = {http://www.almaden.ibm.com/software/quest/Publications/papers/tkde93.pdf}
 }
 

Valid options are:

 -h
  Prints this help.
 
 -o <file>
  The name of the output file, otherwise the generated data is
  printed to stdout.
 
 -r <name>
  The name of the relation.
 
 -d
  Whether to print debug information.
 
 -S
  The seed for the random function (default 1)
 
 -n <num>
  The number of examples to generate (default 100)
 
 -F <num>
  The function to use for generating the data. (default 1)
 
 -B
  Whether to balance the class.
 
 -P <num>
  The perturbation factor. (default 0.05)
 
Version:
$Revision: 10203 $
Author:
Richard Kirkby (rkirkby at cs dot waikato dot ac dot nz), FracPete (fracpete at waikato dot ac dot nz)
  • Constructor Details

    • Agrawal

      public Agrawal()
      Initializes the generator with default values.
  • Method Details

    • globalInfo

      public String globalInfo()
      Returns a string describing this data generator.
      Returns:
      a description of the data generator suitable for displaying in the Explorer/Experimenter GUI
    • getTechnicalInformation

      public TechnicalInformation getTechnicalInformation()
      Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
      Specified by:
      getTechnicalInformation in interface TechnicalInformationHandler
      Returns:
      the technical information about this class
    • listOptions

      public Enumeration<Option> listOptions()
      Returns an enumeration describing the available options.
      Specified by:
      listOptions in interface OptionHandler
      Overrides:
      listOptions in class ClassificationGenerator
      Returns:
      an enumeration of all the available options
    • setOptions

      public void setOptions(String[] options) throws Exception
      Parses a list of options for this object.

      Valid options are:

       -h
        Prints this help.
       
       -o <file>
        The name of the output file, otherwise the generated data is
        printed to stdout.
       
       -r <name>
        The name of the relation.
       
       -d
        Whether to print debug information.
       
       -S
        The seed for the random function (default 1)
       
       -n <num>
        The number of examples to generate (default 100)
       
       -F <num>
        The function to use for generating the data. (default 1)
       
       -B
        Whether to balance the class.
       
       -P <num>
        The perturbation factor. (default 0.05)
       
      Specified by:
      setOptions in interface OptionHandler
      Overrides:
      setOptions in class ClassificationGenerator
      Parameters:
      options - the list of options as an array of strings
      Throws:
      Exception - if an option is not supported
    • getOptions

      public String[] getOptions()
      Gets the current settings of the data generator.
      Specified by:
      getOptions in interface OptionHandler
      Overrides:
      getOptions in class ClassificationGenerator
      Returns:
      an array of strings suitable for passing to setOptions
      See Also:
      • DataGenerator.removeBlacklist(String[])
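      The round trip between getOptions and setOptions can be illustrated with a minimal, self-contained sketch. It does not use Weka: AgrawalSettingsSketch is a hypothetical stand-in that handles only the -n, -F, -B and -P options with the defaults listed above, and its parsing logic is an assumption, not the library's implementation.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the generator's settings; not part of Weka. */
final class AgrawalSettingsSketch {
    int numExamples = 100;        // -n, documented default 100
    int function = 1;             // -F, documented default 1
    boolean balanceClass = false; // -B, a flag without a value
    double perturbation = 0.05;   // -P, documented default 0.05

    /** Parses the subset of options handled by this sketch. */
    void setOptions(String[] options) {
        balanceClass = false; // a flag is off unless it is present
        for (int i = 0; i < options.length; i++) {
            switch (options[i]) {
                case "-n": numExamples = Integer.parseInt(options[++i]); break;
                case "-F": function = Integer.parseInt(options[++i]); break;
                case "-P": perturbation = Double.parseDouble(options[++i]); break;
                case "-B": balanceClass = true; break;
                default:
                    throw new IllegalArgumentException("Unsupported option: " + options[i]);
            }
        }
    }

    /** Returns the current settings in a form that setOptions accepts. */
    String[] getOptions() {
        List<String> result = new ArrayList<>();
        result.add("-n");
        result.add(Integer.toString(numExamples));
        result.add("-F");
        result.add(Integer.toString(function));
        result.add("-P");
        result.add(Double.toString(perturbation));
        if (balanceClass) {
            result.add("-B");
        }
        return result.toArray(new String[0]);
    }

    public static void main(String[] args) {
        AgrawalSettingsSketch first = new AgrawalSettingsSketch();
        first.setOptions(new String[] {"-n", "500", "-B", "-P", "0.1"});

        // Feed the exported options into a fresh object to complete the round trip.
        AgrawalSettingsSketch second = new AgrawalSettingsSketch();
        second.setOptions(first.getOptions());
        System.out.println(String.join(" ", second.getOptions()));
    }
}
```

      With the real class, the same pattern would apply to the arrays returned by getOptions() and accepted by setOptions(String[]).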
    • getFunction

      public SelectedTag getFunction()
      Gets the function for generating the data.
      Returns:
      the function.
    • setFunction

      public void setFunction(SelectedTag value)
      Sets the function for generating the data.
      Parameters:
      value - the function.
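      What a "function" means here can be sketched as a rule that maps a person's attributes to a class label. The example below assumes that function 1 takes the form given in the cited paper (group A when the age is below 40 or at least 60). The class name, the 0/1 label encoding and the age range of 20 to 80 are assumptions for illustration and are not taken from this page.

```java
import java.util.Random;

/** Hypothetical illustration of a labelling function; not Weka code. */
final class AgrawalFunctionSketch {

    /** Assumed form of function 1: group A (0) if age < 40 or age >= 60, otherwise group B (1). */
    static int function1(int age) {
        return (age < 40 || age >= 60) ? 0 : 1;
    }

    public static void main(String[] args) {
        Random random = new Random(1); // seed 1 mirrors the documented default seed
        for (int i = 0; i < 5; i++) {
            int age = 20 + random.nextInt(61); // assumed age range: 20..80 inclusive
            System.out.println("age=" + age + " class=" + function1(age));
        }
    }
}
```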
    • functionTipText

      public String functionTipText()
      Returns the tip text for this property.
      Returns:
      the tip text for this property, suitable for displaying in the Explorer/Experimenter GUI
    • getBalanceClass

      public boolean getBalanceClass()
      Gets whether the class is balanced.
      Returns:
      whether the class is balanced.
    • setBalanceClass

      public void setBalanceClass(boolean value)
      Sets whether the class is balanced.
      Parameters:
      value - whether to balance the class.
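      One way to balance a two-class dataset is to alternate the wanted class and discard candidates until one matches. The sketch below shows that idea only; whether the generator balances classes exactly this way is an assumption, and the class name, the labelling rule and the age range are hypothetical.

```java
import java.util.Random;

/** Hypothetical illustration of class balancing by rejection; not Weka code. */
final class BalancedClassSketch {

    /** Assumed labelling rule: class 0 if age < 40 or age >= 60, otherwise class 1. */
    static int label(int age) {
        return (age < 40 || age >= 60) ? 0 : 1;
    }

    /** Generates n examples with alternating target classes and returns the per-class counts. */
    static int[] generateCounts(int n, long seed) {
        Random random = new Random(seed);
        int[] counts = new int[2];
        int wanted = 0; // class the next accepted example must have
        for (int i = 0; i < n; i++) {
            int cls;
            do {
                int age = 20 + random.nextInt(61); // assumed age range: 20..80 inclusive
                cls = label(age);
            } while (cls != wanted); // discard candidates of the other class
            counts[cls]++;
            wanted = 1 - wanted;
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] counts = generateCounts(100, 1);
        System.out.println(counts[0] + " " + counts[1]); // prints "50 50"
    }
}
```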
    • balanceClassTipText

      public String balanceClassTipText()
      Returns the tip text for this property.
      Returns:
      the tip text for this property, suitable for displaying in the Explorer/Experimenter GUI
    • getPerturbationFraction

      public double getPerturbationFraction()
      Gets the perturbation fraction.
      Returns:
      the perturbation fraction.
    • setPerturbationFraction

      public void setPerturbationFraction(double value)
      Sets the perturbation fraction.
      Parameters:
      value - the perturbation fraction.
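      A perturbation fraction is commonly applied as bounded noise that scales with the range of an attribute. The sketch below shows one plausible reading: the value is shifted by at most fraction times the attribute range in either direction and then clamped to the range. The formula, the class name and the parameter names are assumptions and are not confirmed by this page.

```java
import java.util.Random;

/** Hypothetical illustration of value perturbation; not Weka code. */
final class PerturbationSketch {

    /**
     * Shifts value by uniform noise within +/- (fraction * (max - min))
     * and clamps the result to [min, max].
     */
    static double perturb(double value, double min, double max, double fraction, Random random) {
        double range = max - min;
        double noise = (2.0 * random.nextDouble() - 1.0) * fraction * range;
        return Math.max(min, Math.min(max, value + noise));
    }

    public static void main(String[] args) {
        Random random = new Random(1);
        // With a fraction of 0.05 and a range of 60, the result stays within 47..53.
        System.out.println(perturb(50.0, 20.0, 80.0, 0.05, random));
    }
}
```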
    • perturbationFractionTipText

      public String perturbationFractionTipText()
      Returns the tip text for this property.
      Returns:
      the tip text for this property, suitable for displaying in the Explorer/Experimenter GUI
    • getSingleModeFlag

      public boolean getSingleModeFlag() throws Exception
      Returns whether single mode is set for this data generator. The mode depends on the option settings and/or the generator type.
      Specified by:
      getSingleModeFlag in class DataGenerator
      Returns:
      the single mode flag
      Throws:
      Exception - if the mode is not set yet
    • defineDataFormat

      public Instances defineDataFormat() throws Exception
      Initializes the format for the dataset produced. Must be called before the generateExample or generateExamples methods are used. Re-initializes the random number generator with the given seed.
      Overrides:
      defineDataFormat in class DataGenerator
      Returns:
      the format for the dataset
      Throws:
      Exception - if generating the format fails
    • generateExample

      public Instance generateExample() throws Exception
      Generates one example of the dataset.
      Specified by:
      generateExample in class DataGenerator
      Returns:
      the generated example
      Throws:
      Exception - if the format of the dataset is not yet defined
      Exception - if the generator only works with generateExamples, that is, in non-single mode
    • generateExamples

      public Instances generateExamples() throws Exception
      Generates all examples of the dataset. Re-initializes the random number generator with the given seed, before generating instances.
      Specified by:
      generateExamples in class DataGenerator
      Returns:
      the generated dataset
      Throws:
      Exception - if the format of the dataset is not yet defined
      Exception - if the generator only works with generateExample, that is, in single mode
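      The practical effect of re-initializing the random number generator with the same seed is that repeated runs produce identical data. The sketch below demonstrates this with java.util.Random only; the class name and the generated age values are hypothetical and stand in for the generator's instances.

```java
import java.util.Arrays;
import java.util.Random;

/** Hypothetical illustration of seed-based reproducibility; not Weka code. */
final class ReseedSketch {

    /** Creates a fresh Random from the seed on every call, then draws n values. */
    static int[] generate(long seed, int n) {
        Random random = new Random(seed);
        int[] ages = new int[n];
        for (int i = 0; i < n; i++) {
            ages[i] = 20 + random.nextInt(61); // assumed age range: 20..80 inclusive
        }
        return ages;
    }

    public static void main(String[] args) {
        // Two runs with the same seed yield the same sequence.
        System.out.println(Arrays.equals(generate(1, 5), generate(1, 5))); // prints "true"
    }
}
```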
    • generateStart

      public String generateStart()
      Generates a comment string that documents the data generator. By default, this string is added at the beginning of the produced output in ARFF format, directly after the options.
      Specified by:
      generateStart in class DataGenerator
      Returns:
      a string containing information about the generated rules
    • generateFinished

      public String generateFinished() throws Exception
      Generates a comment string that documents the data generator. By default, this string is added at the end of the produced output in ARFF format.
      Specified by:
      generateFinished in class DataGenerator
      Returns:
      a string containing information about the generated rules
      Throws:
      Exception - if generating the documentation fails
    • getRevision

      public String getRevision()
      Returns the revision string.
      Specified by:
      getRevision in interface RevisionHandler
      Returns:
      the revision
    • main

      public static void main(String[] args)
      Main method for executing this class.
      Parameters:
      args - should contain arguments for the data producer: