Class SnowballStemmer

java.lang.Object
weka.core.stemmers.SnowballStemmer
All Implemented Interfaces:
Serializable, OptionHandler, RevisionHandler, Stemmer

public class SnowballStemmer extends Object implements Stemmer, OptionHandler
A wrapper class for the Snowball stemmers. It is only available if the Snowball classes are on the classpath.
If class discovery is not dynamic, i.e., the property 'UseDynamic' in the props file 'weka/gui/GenericPropertiesCreator.props' is set to 'false', then the property 'org.tartarus.snowball.SnowballProgram' in the file 'weka/gui/GenericObjectEditor.props' must be uncommented as well. If necessary, you have to discover the Snowball stemmers and fill them in manually. You can use 'weka.core.ClassDiscovery' for this:
java weka.core.ClassDiscovery org.tartarus.snowball.SnowballProgram org.tartarus.snowball.ext

Valid options are:

 -S <name>
  The name of the snowball stemmer (default 'porter').
  available stemmers:
     danish, dutch, english, finnish, french, german, italian, 
     norwegian, porter, portuguese, russian, spanish, swedish
 
Version:
$Revision: 15257 $
Author:
FracPete (fracpete at waikato dot ac dot nz)

  • Constructor Details

    • SnowballStemmer

      public SnowballStemmer()
      Initializes the stemmer with the default stemmer ("porter").
    • SnowballStemmer

      public SnowballStemmer(String name)
      Initializes the wrapper with the stemmer of the given name.
      Parameters:
      name - the name of the stemmer
  • Method Details

    • globalInfo

      public String globalInfo()
      Returns a string describing the stemmer.
      Returns:
      a description suitable for displaying in the Explorer/Experimenter GUI
    • listOptions

      public Enumeration<Option> listOptions()
      Returns an enumeration describing the available options.
      Specified by:
      listOptions in interface OptionHandler
      Returns:
      an enumeration of all the available options.
    • setOptions

      public void setOptions(String[] options) throws Exception
      Parses the options.

      Valid options are:

       -S <name>
        The name of the snowball stemmer (default 'porter').
        available stemmers:
           danish, dutch, english, finnish, french, german, italian, 
           norwegian, porter, portuguese, russian, spanish, swedish
       
      Specified by:
      setOptions in interface OptionHandler
      Parameters:
      options - the options to parse
      Throws:
      Exception - if parsing fails
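
The option array described above can be assembled before it is handed to setOptions. The following is a minimal sketch in plain Java; the class `SnowballOptions` and its method `forStemmer` are hypothetical helpers and not part of Weka. The name list is copied from the option description, and the stemmers actually available at run time may differ depending on the classpath.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Weka: builds the option array for the -S flag.
final class SnowballOptions {
    // Names copied from the option description above; the stemmers that are
    // really available at run time depend on what is on the classpath.
    static final List<String> DOCUMENTED_NAMES = List.of(
        "danish", "dutch", "english", "finnish", "french", "german", "italian",
        "norwegian", "porter", "portuguese", "russian", "spanish", "swedish");

    // Returns {"-S", name}, rejecting null and names missing from the documented list.
    static String[] forStemmer(String name) {
        if (name == null || !DOCUMENTED_NAMES.contains(name)) {
            throw new IllegalArgumentException("Unknown stemmer name: " + name);
        }
        return new String[] {"-S", name};
    }

    public static void main(String[] args) {
        // With Weka on the classpath, this array could be passed to setOptions(String[]).
        System.out.println(Arrays.toString(forStemmer("english")));
    }
}
```

Validating against a fixed list is only an illustration; checking against the names returned by listStemmers() would likely reflect the actual installation more closely.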
    • getOptions

      public String[] getOptions()
      Gets the current settings of the stemmer.
      Specified by:
      getOptions in interface OptionHandler
      Returns:
      an array of strings suitable for passing to setOptions
    • isPresent

      public static boolean isPresent()
      Returns whether Snowball is present, i.e., whether its classes are on the classpath.
      Returns:
      whether Snowball is available
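
A classpath check of this kind can be sketched in plain Java. The class `ClasspathProbe` and its method `isOnClasspath` are hypothetical and are not claimed to be how isPresent() is implemented in Weka; only the class name 'org.tartarus.snowball.SnowballProgram' comes from the class description.

```java
// Hypothetical helper, not part of Weka: probes the classpath for a class by name.
final class ClasspathProbe {
    // Class name taken from the class description above.
    static final String SNOWBALL_BASE = "org.tartarus.snowball.SnowballProgram";

    // Returns true if the named class can be located; static initializers are not run.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("Snowball on classpath: " + isOnClasspath(SNOWBALL_BASE));
    }
}
```

In code that already depends on Weka, calling SnowballStemmer.isPresent() directly is the simpler choice; the probe above only illustrates the idea.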
    • listStemmers

      public static Enumeration<String> listStemmers()
      Returns an enumeration over all currently stored stemmer names.
      Returns:
      all available stemmers
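
Because the names come back as an Enumeration<String>, it can be convenient to copy them into a list first. The sketch below uses only java.util; the class `StemmerNames` is a hypothetical helper, and the sample names in main stand in for whatever listStemmers() would return in a real installation.

```java
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;

// Hypothetical helper, not part of Weka: turns an Enumeration of names into a sorted list.
final class StemmerNames {
    // Copies the enumeration into a mutable list and sorts it alphabetically.
    static List<String> sorted(Enumeration<String> names) {
        List<String> result = Collections.list(names);
        Collections.sort(result);
        return result;
    }

    public static void main(String[] args) {
        // Stand-in data; with Weka one would pass SnowballStemmer.listStemmers() instead.
        Enumeration<String> sample =
            Collections.enumeration(List.of("porter", "english", "german"));
        System.out.println(sorted(sample));
    }
}
```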
    • getStemmer

      public String getStemmer()
      Returns the name of the current stemmer, or null if none is set.
      Returns:
      the name of the stemmer
    • setStemmer

      public void setStemmer(String name)
      Sets the stemmer with the given name, e.g., "porter".
      Parameters:
      name - the name of the stemmer, e.g., "porter"
    • stemmerTipText

      public String stemmerTipText()
      Returns the tip text for this property.
      Returns:
      tip text for this property suitable for displaying in the Explorer/Experimenter GUI
    • stem

      public String stem(String word)
      Returns the word in its stemmed form.
      Specified by:
      stem in interface Stemmer
      Parameters:
      word - the unstemmed word
      Returns:
      the stemmed word
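
Since stem takes one word and returns one word, it can be applied token by token. The following sketch shows that pattern in plain Java; the class `StemAll` is hypothetical, and the function used in main is a toy stand-in that merely strips a trailing "s", not a Snowball algorithm.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical helper, not part of Weka: applies a word-level stem function to a list.
final class StemAll {
    // Returns a new list holding the result of applying stem to each word, in order.
    static List<String> stemAll(List<String> words, UnaryOperator<String> stem) {
        List<String> out = new ArrayList<>(words.size());
        for (String word : words) {
            out.add(stem.apply(word));
        }
        return out;
    }

    public static void main(String[] args) {
        // Toy stand-in for a real stemmer: drops one trailing "s" if present.
        UnaryOperator<String> toy =
            w -> w.endsWith("s") ? w.substring(0, w.length() - 1) : w;
        System.out.println(stemAll(List.of("cats", "dog"), toy));
    }
}
```

With Weka and Snowball on the classpath, a method reference such as `new SnowballStemmer("english")::stem` could presumably be passed as the second argument, since stem(String) has a matching shape.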
    • toString

      public String toString()
      Returns a string representation of the stemmer.
      Overrides:
      toString in class Object
      Returns:
      a string representation of the stemmer
    • getRevision

      public String getRevision()
      Returns the revision string.
      Specified by:
      getRevision in interface RevisionHandler
      Returns:
      the revision
    • main

      public static void main(String[] args)
      Runs the stemmer with the given options.
      Parameters:
      args - the options