Package weka.core

Class Attribute

java.lang.Object
weka.core.Attribute
All Implemented Interfaces:
Serializable, Copyable, RevisionHandler

public class Attribute extends Object implements Copyable, Serializable, RevisionHandler
Class for handling an attribute. Once an attribute has been created, it can't be changed.

The following attribute types are supported:

  • numeric:
    This type of attribute represents a floating-point number.
  • nominal:
    This type of attribute represents a fixed set of nominal values.
  • string:
    This type of attribute represents a dynamically expanding set of nominal values. Usually used in text classification.
  • date:
    This type of attribute represents a date, internally represented as a floating-point number storing the milliseconds since January 1, 1970, 00:00:00 GMT. The string representation of the date must be ISO-8601 compliant; the default is yyyy-MM-dd'T'HH:mm:ss.
  • relational:
    This type of attribute can contain other attributes and is, e.g., used for representing Multi-Instance data. (Multi-Instance data consists of a nominal attribute containing the bag-id, then a relational attribute with all the attributes of the bag, and finally the class attribute.)
Typical usage (code from the main() method of this class):

...
// Create numeric attributes "length" and "weight"
Attribute length = new Attribute("length");
Attribute weight = new Attribute("weight");

// Create list to hold nominal values "first", "second", "third"
List<String> my_nominal_values = new ArrayList<String>(3);
my_nominal_values.add("first");
my_nominal_values.add("second");
my_nominal_values.add("third");

// Create nominal attribute "position"
Attribute position = new Attribute("position", my_nominal_values);
...

Version:
$Revision: 14509 $
Author:
Eibe Frank (eibe@cs.waikato.ac.nz)
  • Field Summary

    Fields
    Modifier and Type
    Field
    Description
    static final String
    ARFF_ATTRIBUTE
    The keyword used to denote the start of an ARFF attribute declaration
    static final String
    ARFF_ATTRIBUTE_DATE
    The keyword used to denote a date attribute
    static final String
    ARFF_ATTRIBUTE_INTEGER
    A keyword used to denote a numeric attribute
    static final String
    ARFF_ATTRIBUTE_NUMERIC
    A keyword used to denote a numeric attribute
    static final String
    ARFF_ATTRIBUTE_REAL
    A keyword used to denote a numeric attribute
    static final String
    ARFF_ATTRIBUTE_RELATIONAL
    The keyword used to denote a relation-valued attribute
    static final String
    ARFF_ATTRIBUTE_STRING
    The keyword used to denote a string attribute
    static final String
    ARFF_END_SUBRELATION
    The keyword used to denote the end of the declaration of a subrelation
    static final int
    DATE
    Constant set for attributes with date values.
    static final int
    NOMINAL
    Constant set for nominal attributes.
    static final int
    NUMERIC
    Constant set for numeric attributes.
    static final int
    ORDERING_MODULO
    Constant set for modulo-ordered attributes.
    static final int
    ORDERING_ORDERED
    Constant set for ordered attributes.
    static final int
    ORDERING_SYMBOLIC
    Constant set for symbolic attributes.
    static final int
    RELATIONAL
    Constant set for relation-valued attributes.
    static final int
    STRING
    Constant set for attributes with string values.
  • Constructor Summary

    Constructors
    Constructor
    Description
    Attribute(String attributeName)
    Constructor for a numeric attribute.
    Attribute(String attributeName, boolean createStringAttribute)
    Constructor for a numeric or string attribute.
    Attribute(String attributeName, boolean createStringAttribute, ProtectedProperties metadata)
    Constructor for a numeric or string attribute, where metadata is supplied.
    Attribute(String attributeName, int index)
    Constructor for a numeric attribute with a particular index.
    Attribute(String attributeName, String dateFormat)
    Constructor for a date attribute.
    Attribute(String attributeName, String dateFormat, int index)
    Constructor for date attributes with a particular index.
    Attribute(String attributeName, String dateFormat, ProtectedProperties metadata)
    Constructor for a date attribute, where metadata is supplied.
    Attribute(String attributeName, List<String> attributeValues)
    Constructor for nominal attributes and string attributes.
    Attribute(String attributeName, List<String> attributeValues, int index)
    Constructor for nominal attributes and string attributes with a particular index.
    Attribute(String attributeName, List<String> attributeValues, ProtectedProperties metadata)
    Constructor for nominal attributes and string attributes, where metadata is supplied.
    Attribute(String attributeName, Instances header)
    Constructor for relation-valued attributes.
    Attribute(String attributeName, Instances header, int index)
    Constructor for a relation-valued attribute with a particular index.
    Attribute(String attributeName, Instances header, ProtectedProperties metadata)
    Constructor for relation-valued attributes, where metadata is supplied.
    Attribute(String attributeName, ProtectedProperties metadata)
    Constructor for a numeric attribute, where metadata is supplied.
  • Method Summary

    Modifier and Type
    Method
    Description
    int
    addRelation(Instances value)
    Adds a relation to a relation-valued attribute.
    int
    addStringValue(String value)
    Adds a string value to the list of valid strings for attributes of type STRING and returns the index of the string.
    int
    addStringValue(Attribute src, int index)
    Adds a string value to the list of valid strings for attributes of type STRING and returns the index of the string.
    Object
    copy()
    Produces a shallow copy of this attribute.
    final Attribute
    copy(String newName)
    Produces a shallow copy of this attribute with a new name.
    final Enumeration<Object>
    enumerateValues()
    Returns an enumeration of all the attribute's values if the attribute is nominal, string, or relation-valued, null otherwise.
    final boolean
    equals(Object other)
    Tests if given attribute is equal to this attribute.
    final String
    equalsMsg(Object other)
    Tests if given attribute is equal to this attribute.
    String
    formatDate(double date)
    Returns the given amount of milliseconds formatted according to the current Date format.
    final String
    getDateFormat()
    Returns the Date format pattern if this attribute is of type DATE, otherwise an empty string.
    final double
    getLowerNumericBound()
    Returns the lower bound of a numeric attribute.
    final ProtectedProperties
    getMetadata()
    Returns the properties supplied for this attribute.
    String
    getRevision()
    Returns the revision string.
    final double
    getUpperNumericBound()
    Returns the upper bound of a numeric attribute.
    final int
    hashCode()
    Returns a hash code for this attribute based on its name.
    final boolean
    hasZeropoint()
    Returns whether the attribute has a zeropoint and may be added meaningfully.
    final int
    index()
    Returns the index of this attribute.
    final int
    indexOfValue(String value)
    Returns the index of a given attribute value.
    final boolean
    isAveragable()
    Returns whether the attribute can be averaged meaningfully.
    final boolean
    isDate()
    Tests if the attribute is a date type.
    final boolean
    isInRange(double value)
    Determines whether a value lies within the bounds of the attribute.
    final boolean
    isNominal()
    Tests if the attribute is nominal.
    final boolean
    isNumeric()
    Tests if the attribute is numeric.
    final boolean
    isRegular()
    Returns whether the attribute values are equally spaced.
    final boolean
    isRelationValued()
    Tests if the attribute is relation valued.
    final boolean
    isString()
    Tests if the attribute is a string.
    final boolean
    lowerNumericBoundIsOpen()
    Returns whether the lower numeric bound of the attribute is open.
    static void
    main(String[] ops)
    Simple main method for testing this class.
    final String
    name()
    Returns the attribute's name.
    final int
    numValues()
    Returns the number of attribute values.
    final int
    ordering()
    Returns the ordering of the attribute.
    double
    parseDate(String string)
    Parses the given String as a Date according to the current format, and returns the corresponding amount of milliseconds.
    final Instances
    relation()
    Returns the header info for a relation-valued attribute, null if the attribute is not relation-valued.
    final Instances
    relation(int valIndex)
    Returns a value of a relation-valued attribute.
    void
    setStringValue(String value)
    Clears the map and list of values and sets them to contain just the supplied value.
    void
    setWeight(double value)
    Sets the attribute's new weight.
    final String
    toString()
    Returns a description of this attribute in ARFF format.
    final int
    type()
    Returns the attribute's type as an integer.
    static String
    typeToString(int type)
    Returns a string representation of the attribute type.
    static String
    typeToString(Attribute att)
    Returns a string representation of the attribute type.
    static String
    typeToStringShort(int type)
    Returns a short string representation of the attribute type.
    static String
    typeToStringShort(Attribute att)
    Returns a short string representation of the attribute type.
    final boolean
    upperNumericBoundIsOpen()
    Returns whether the upper numeric bound of the attribute is open.
    final String
    value(int valIndex)
    Returns a value of a nominal or string attribute.
    final double
    weight()
    Returns the attribute's weight.

    Methods inherited from class java.lang.Object

    getClass, notify, notifyAll, wait, wait, wait
  • Field Details

    • NUMERIC

      public static final int NUMERIC
      Constant set for numeric attributes.
    • NOMINAL

      public static final int NOMINAL
      Constant set for nominal attributes.
    • STRING

      public static final int STRING
      Constant set for attributes with string values.
    • DATE

      public static final int DATE
      Constant set for attributes with date values.
    • RELATIONAL

      public static final int RELATIONAL
      Constant set for relation-valued attributes.
    • ORDERING_SYMBOLIC

      public static final int ORDERING_SYMBOLIC
      Constant set for symbolic attributes.
    • ORDERING_ORDERED

      public static final int ORDERING_ORDERED
      Constant set for ordered attributes.
    • ORDERING_MODULO

      public static final int ORDERING_MODULO
      Constant set for modulo-ordered attributes.
    • ARFF_ATTRIBUTE

      public static final String ARFF_ATTRIBUTE
      The keyword used to denote the start of an ARFF attribute declaration
    • ARFF_ATTRIBUTE_INTEGER

      public static final String ARFF_ATTRIBUTE_INTEGER
      A keyword used to denote a numeric attribute
    • ARFF_ATTRIBUTE_REAL

      public static final String ARFF_ATTRIBUTE_REAL
      A keyword used to denote a numeric attribute
    • ARFF_ATTRIBUTE_NUMERIC

      public static final String ARFF_ATTRIBUTE_NUMERIC
      A keyword used to denote a numeric attribute
    • ARFF_ATTRIBUTE_STRING

      public static final String ARFF_ATTRIBUTE_STRING
      The keyword used to denote a string attribute
    • ARFF_ATTRIBUTE_DATE

      public static final String ARFF_ATTRIBUTE_DATE
      The keyword used to denote a date attribute
    • ARFF_ATTRIBUTE_RELATIONAL

      public static final String ARFF_ATTRIBUTE_RELATIONAL
      The keyword used to denote a relation-valued attribute
    • ARFF_END_SUBRELATION

      public static final String ARFF_END_SUBRELATION
      The keyword used to denote the end of the declaration of a subrelation
  • Constructor Details

    • Attribute

      public Attribute(String attributeName)
      Constructor for a numeric attribute.
      Parameters:
      attributeName - the name for the attribute
    • Attribute

      public Attribute(String attributeName, ProtectedProperties metadata)
      Constructor for a numeric attribute, where metadata is supplied.
      Parameters:
      attributeName - the name for the attribute
      metadata - the attribute's properties
    • Attribute

      public Attribute(String attributeName, boolean createStringAttribute)
      Constructor for a numeric or string attribute. Provides an alternative way for creating string attributes.
      Parameters:
      attributeName - the name for the attribute
      createStringAttribute - if true, a string attribute will be created, otherwise a numeric one.
    • Attribute

      public Attribute(String attributeName, boolean createStringAttribute, ProtectedProperties metadata)
      Constructor for a numeric or string attribute, where metadata is supplied. Provides an alternative way for creating string attributes.
      Parameters:
      attributeName - the name for the attribute
      createStringAttribute - if true, a string attribute will be created, otherwise a numeric one.
      metadata - the attribute's properties
    • Attribute

      public Attribute(String attributeName, String dateFormat)
      Constructor for a date attribute.
      Parameters:
      attributeName - the name for the attribute
      dateFormat - a string suitable for use with SimpleDateFormat for parsing dates.
    • Attribute

      public Attribute(String attributeName, String dateFormat, ProtectedProperties metadata)
      Constructor for a date attribute, where metadata is supplied.
      Parameters:
      attributeName - the name for the attribute
      dateFormat - a string suitable for use with SimpleDateFormat for parsing dates.
      metadata - the attribute's properties
    • Attribute

      public Attribute(String attributeName, List<String> attributeValues)
      Constructor for nominal attributes and string attributes. If a null list of attribute values is passed to the method, the attribute is assumed to be a string.
      Parameters:
      attributeName - the name for the attribute
      attributeValues - a list of strings denoting the attribute values. Null if the attribute is a string attribute.
    • Attribute

      public Attribute(String attributeName, List<String> attributeValues, ProtectedProperties metadata)
      Constructor for nominal attributes and string attributes, where metadata is supplied. If a null list of attribute values is passed to the method, the attribute is assumed to be a string.
      Parameters:
      attributeName - the name for the attribute
      attributeValues - a list of strings denoting the attribute values. Null if the attribute is a string attribute.
      metadata - the attribute's properties
    • Attribute

      public Attribute(String attributeName, Instances header)
      Constructor for relation-valued attributes.
      Parameters:
      attributeName - the name for the attribute
      header - an Instances object specifying the header of the relation.
    • Attribute

      public Attribute(String attributeName, Instances header, ProtectedProperties metadata)
      Constructor for relation-valued attributes, where metadata is supplied.
      Parameters:
      attributeName - the name for the attribute
      header - an Instances object specifying the header of the relation.
      metadata - the attribute's properties
    • Attribute

      public Attribute(String attributeName, int index)
      Constructor for a numeric attribute with a particular index.
      Parameters:
      attributeName - the name for the attribute
      index - the attribute's index
    • Attribute

      public Attribute(String attributeName, String dateFormat, int index)
      Constructor for date attributes with a particular index.
      Parameters:
      attributeName - the name for the attribute
      dateFormat - a string suitable for use with SimpleDateFormat for parsing dates. Null for a default format string.
      index - the attribute's index
    • Attribute

      public Attribute(String attributeName, List<String> attributeValues, int index)
      Constructor for nominal attributes and string attributes with a particular index. If a null list of attribute values is passed to the method, the attribute is assumed to be a string.
      Parameters:
      attributeName - the name for the attribute
      attributeValues - a list of strings denoting the attribute values. Null if the attribute is a string attribute.
      index - the attribute's index
    • Attribute

      public Attribute(String attributeName, Instances header, int index)
      Constructor for a relation-valued attribute with a particular index.
      Parameters:
      attributeName - the name for the attribute
      header - the header information for this attribute
      index - the attribute's index
  • Method Details

    • copy

      public Object copy()
      Produces a shallow copy of this attribute.
      Specified by:
      copy in interface Copyable
      Returns:
      a copy of this attribute with the same index
    • enumerateValues

      public final Enumeration<Object> enumerateValues()
      Returns an enumeration of all the attribute's values if the attribute is nominal, string, or relation-valued, null otherwise.
      Returns:
      enumeration of all the attribute's values
    • equals

      public final boolean equals(Object other)
      Tests if given attribute is equal to this attribute. Attribute indices are ignored in the comparison.
      Overrides:
      equals in class Object
      Parameters:
      other - the Object to be compared to this attribute
      Returns:
      true if the given attribute is equal to this attribute
    • hashCode

      public final int hashCode()
      Returns a hash code for this attribute based on its name.
      Overrides:
      hashCode in class Object
      Returns:
      the hash code
    • equalsMsg

      public final String equalsMsg(Object other)
      Tests if given attribute is equal to this attribute. If they're not the same a message detailing why they differ will be returned, otherwise null. Attribute indices are ignored in the comparison.
      Parameters:
      other - the Object to be compared to this attribute
      Returns:
      null if the given attribute is equal to this attribute
    • typeToString

      public static String typeToString(Attribute att)
      Returns a string representation of the attribute type.
      Parameters:
      att - the attribute to return the type string for
      Returns:
      the string representation of the attribute type
    • typeToString

      public static String typeToString(int type)
      Returns a string representation of the attribute type.
      Parameters:
      type - the type of the attribute
      Returns:
      the string representation of the attribute type
    • typeToStringShort

      public static String typeToStringShort(Attribute att)
      Returns a short string representation of the attribute type.
      Parameters:
      att - the attribute to return the type string for
      Returns:
      the string representation of the attribute type
    • typeToStringShort

      public static String typeToStringShort(int type)
      Returns a short string representation of the attribute type.
      Parameters:
      type - the type of the attribute
      Returns:
      the string representation of the attribute type
    • index

      public final int index()
      Returns the index of this attribute.
      Returns:
      the index of this attribute
    • indexOfValue

      public final int indexOfValue(String value)
      Returns the index of a given attribute value. (The index of the first occurrence of this value.)
      Parameters:
      value - the value for which the index is to be returned
      Returns:
      the index of the given attribute value if attribute is nominal or a string, -1 if it is not or the value can't be found
    • isNominal

      public final boolean isNominal()
      Tests if the attribute is nominal.
      Returns:
      true if the attribute is nominal
    • isNumeric

      public final boolean isNumeric()
      Tests if the attribute is numeric.
      Returns:
      true if the attribute is numeric
    • isRelationValued

      public final boolean isRelationValued()
      Tests if the attribute is relation valued.
      Returns:
      true if the attribute is relation valued
    • isString

      public final boolean isString()
      Tests if the attribute is a string.
      Returns:
      true if the attribute is a string
    • isDate

      public final boolean isDate()
      Tests if the attribute is a date type.
      Returns:
      true if the attribute is a date type
    • name

      public final String name()
      Returns the attribute's name.
      Returns:
      the attribute's name as a string
    • numValues

      public final int numValues()
      Returns the number of attribute values. Returns 0 for attributes that are not nominal, string, or relation-valued.
      Returns:
      the number of attribute values
    • toString

      public final String toString()
      Returns a description of this attribute in ARFF format. Quotes strings if they contain whitespace characters, or if they are a question mark.
      Overrides:
      toString in class Object
      Returns:
      a description of this attribute as a string
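The quoting rule described for toString() can be pictured with a small stand-alone sketch. It is not the Weka implementation: the class ArffDeclarationSketch and its methods are hypothetical, the "@attribute" keyword and the single-quote character are assumptions taken from the ARFF format, and escaping of quote characters inside a value is deliberately left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of weka.core.
final class ArffDeclarationSketch {

  // Quotes a string only if it contains whitespace or is a lone question mark,
  // which is the rule stated for toString(). Escaping is not handled here.
  static String quoteIfNeeded(String s) {
    boolean hasWhitespace = s.chars().anyMatch(Character::isWhitespace);
    if (hasWhitespace || s.equals("?")) {
      return "'" + s + "'";
    }
    return s;
  }

  // Builds an ARFF-style declaration line for a nominal attribute.
  // Assumption: "@attribute" is the keyword that starts the declaration.
  static String nominalDeclaration(String name, List<String> values) {
    List<String> quoted = new ArrayList<>();
    for (String value : values) {
      quoted.add(quoteIfNeeded(value));
    }
    return "@attribute " + quoteIfNeeded(name) + " {" + String.join(",", quoted) + "}";
  }

  public static void main(String[] args) {
    List<String> values = new ArrayList<>();
    values.add("first");
    values.add("second place");
    values.add("?");
    System.out.println(nominalDeclaration("position", values));
  }
}
```

Values without whitespace pass through unchanged, so a declaration stays readable unless quoting is actually required.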
    • type

      public final int type()
      Returns the attribute's type as an integer.
      Returns:
      the attribute's type.
    • getDateFormat

      public final String getDateFormat()
      Returns the Date format pattern if this attribute is of type DATE, otherwise an empty string.
      Returns:
      the date format pattern
    • value

      public final String value(int valIndex)
      Returns a value of a nominal or string attribute. Returns an empty string if the attribute is neither a string nor a nominal attribute.
      Parameters:
      valIndex - the value's index
      Returns:
      the attribute's value as a string
    • relation

      public final Instances relation()
      Returns the header info for a relation-valued attribute, null if the attribute is not relation-valued.
      Returns:
      the attribute's value as an Instances object
    • relation

      public final Instances relation(int valIndex)
      Returns a value of a relation-valued attribute. Returns null if the attribute is not relation-valued.
      Parameters:
      valIndex - the value's index
      Returns:
      the attribute's value as an Instances object
    • addStringValue

      public int addStringValue(String value)
      Adds a string value to the list of valid strings for attributes of type STRING and returns the index of the string.
      Parameters:
      value - The string value to add
      Returns:
      the index assigned to the string, or -1 if the attribute is not of type Attribute.STRING
    • setStringValue

      public void setStringValue(String value)
      Clears the map and list of values and sets them to contain just the supplied value.
      Parameters:
      value - the current (and only) value of this String attribute. If null then just the map is cleared.
    • addStringValue

      public int addStringValue(Attribute src, int index)
      Adds a string value to the list of valid strings for attributes of type STRING and returns the index of the string. This method is more efficient than addStringValue(String) for long strings.
      Parameters:
      src - The Attribute containing the string value to add.
      index - the index of the string value in the source attribute.
      Returns:
      the index assigned to the string, or -1 if the attribute is not of type Attribute.STRING
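The idea of a string attribute as a growing list of values, each identified by the index returned on insertion, can be modelled with a short stand-in class. This is a sketch only, not the Weka source: StringValueStoreSketch is a hypothetical name, and returning the existing index when the same string is added twice is an assumption about the intended semantics rather than something this page states.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of a string attribute's value store, not part of weka.core.
final class StringValueStoreSketch {

  private final List<String> values = new ArrayList<>();          // index -> value
  private final Map<String, Integer> indices = new HashMap<>();   // value -> index

  // Adds a value and returns its index.
  // Assumption: a value that is already stored keeps its original index.
  int addStringValue(String value) {
    Integer existing = indices.get(value);
    if (existing != null) {
      return existing;
    }
    int index = values.size();
    values.add(value);
    indices.put(value, index);
    return index;
  }

  // Looks a value up by index, mirroring the role of value(int valIndex).
  String value(int valIndex) {
    return values.get(valIndex);
  }

  // Returns the index of a value, or -1 if it is not stored.
  int indexOfValue(String value) {
    Integer index = indices.get(value);
    return index == null ? -1 : index;
  }

  // Number of distinct values currently stored.
  int numValues() {
    return values.size();
  }

  public static void main(String[] args) {
    StringValueStoreSketch store = new StringValueStoreSketch();
    int first = store.addStringValue("a long document text");
    int second = store.addStringValue("another document");
    System.out.println(first + " " + second + " " + store.numValues());
  }
}
```

In this model an instance would store only the returned index, which is why the set of values can keep expanding without touching existing data.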
    • addRelation

      public int addRelation(Instances value)
      Adds a relation to a relation-valued attribute.
      Parameters:
      value - The value to add
      Returns:
      the index assigned to the value, or -1 if the attribute is not of type Attribute.RELATIONAL
    • copy

      public final Attribute copy(String newName)
      Produces a shallow copy of this attribute with a new name.
      Parameters:
      newName - the name of the new attribute
      Returns:
      a copy of this attribute with the same index
    • formatDate

      public String formatDate(double date)
      Returns the given amount of milliseconds formatted according to the current Date format.
      Parameters:
      date - the date, represented in milliseconds since January 1, 1970, 00:00:00 GMT, to return as string
      Returns:
      the formatted date
    • parseDate

      public double parseDate(String string) throws ParseException
      Parses the given String as a Date according to the current format, and returns the corresponding amount of milliseconds.
      Parameters:
      string - the date to parse
      Returns:
      the date in milliseconds since January 1, 1970, 00:00:00 GMT
      Throws:
      ParseException - if parsing fails
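The relationship between formatDate and parseDate can be sketched with the JDK alone. The following is a minimal illustration, not Weka's implementation: it assumes the default pattern yyyy-MM-dd'T'HH:mm:ss named in the class description, pins the time zone to GMT and the locale to Locale.ROOT purely so the result is reproducible, and uses a hypothetical class name, DateRoundTripSketch.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper, not part of weka.core.
final class DateRoundTripSketch {

  // Default pattern quoted in the class description of Attribute.
  static final String PATTERN = "yyyy-MM-dd'T'HH:mm:ss";

  // Assumption: GMT and Locale.ROOT are fixed only to make the example deterministic.
  private static SimpleDateFormat newFormat() {
    SimpleDateFormat format = new SimpleDateFormat(PATTERN, Locale.ROOT);
    format.setTimeZone(TimeZone.getTimeZone("GMT"));
    format.setLenient(false);
    return format;
  }

  // Text in, milliseconds since January 1, 1970, 00:00:00 GMT out (as a double).
  static double parse(String text) {
    try {
      return newFormat().parse(text).getTime();
    } catch (ParseException e) {
      throw new IllegalArgumentException("Unparseable date: " + text, e);
    }
  }

  // Milliseconds since the epoch in, formatted text out.
  static String format(double millis) {
    return newFormat().format(new Date((long) millis));
  }

  public static void main(String[] args) {
    double millis = parse("2001-04-03T12:30:00");
    System.out.println(millis);
    System.out.println(format(millis));
  }
}
```

Storing the date as a double lets it sit alongside numeric attribute values, at the cost of a cast back to long before formatting.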
    • getMetadata

      public final ProtectedProperties getMetadata()
      Returns the properties supplied for this attribute. Returns null if there is no meta data for this attribute.
      Returns:
      metadata for this attribute
    • ordering

      public final int ordering()
      Returns the ordering of the attribute. One of the following: ORDERING_SYMBOLIC - attribute values should be treated as symbols. ORDERING_ORDERED - attribute values have a global ordering. ORDERING_MODULO - attribute values have an ordering which wraps.
      Returns:
      the ordering type of the attribute
    • isRegular

      public final boolean isRegular()
      Returns whether the attribute values are equally spaced.
      Returns:
      whether the attribute is regular or not
    • isAveragable

      public final boolean isAveragable()
      Returns whether the attribute can be averaged meaningfully.
      Returns:
      whether the attribute can be averaged or not
    • hasZeropoint

      public final boolean hasZeropoint()
      Returns whether the attribute has a zeropoint and may be added meaningfully.
      Returns:
      whether the attribute has a zeropoint or not
    • weight

      public final double weight()
      Returns the attribute's weight.
      Returns:
      the attribute's weight as a double
    • setWeight

      public void setWeight(double value)
      Sets the attribute's new weight. Does not modify the weight info stored in the attribute's meta data object!
      Parameters:
      value - the new weight
    • getLowerNumericBound

      public final double getLowerNumericBound()
      Returns the lower bound of a numeric attribute.
      Returns:
      the lower bound of the specified numeric range
    • lowerNumericBoundIsOpen

      public final boolean lowerNumericBoundIsOpen()
      Returns whether the lower numeric bound of the attribute is open.
      Returns:
      whether the lower numeric bound is open or not (closed)
    • getUpperNumericBound

      public final double getUpperNumericBound()
      Returns the upper bound of a numeric attribute.
      Returns:
      the upper bound of the specified numeric range
    • upperNumericBoundIsOpen

      public final boolean upperNumericBoundIsOpen()
      Returns whether the upper numeric bound of the attribute is open.
      Returns:
      whether the upper numeric bound is open or not (closed)
    • isInRange

      public final boolean isInRange(double value)
      Determines whether a value lies within the bounds of the attribute.
      Parameters:
      value - the value to check
      Returns:
      whether the value is in range
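How the two bound values and the two "open" flags combine can be sketched as a plain interval test. This is an illustrative model only: the class NumericRangeSketch and its parameters are hypothetical, and the sketch ignores any cases the real method may treat specially, such as missing values or non-numeric attributes.

```java
// Hypothetical helper, not part of weka.core.
final class NumericRangeSketch {

  // Returns true if value lies inside the interval described by the bounds.
  // An open bound excludes the bound value itself; a closed bound includes it.
  static boolean isInRange(double value,
                           double lower, boolean lowerOpen,
                           double upper, boolean upperOpen) {
    boolean aboveLower = lowerOpen ? value > lower : value >= lower;
    boolean belowUpper = upperOpen ? value < upper : value <= upper;
    return aboveLower && belowUpper;
  }

  public static void main(String[] args) {
    // Interval [0, 1): 0 is included, 1 is excluded.
    System.out.println(isInRange(0.0, 0.0, false, 1.0, true));
    System.out.println(isInRange(1.0, 0.0, false, 1.0, true));
  }
}
```

An unbounded attribute corresponds to infinite bounds with both ends open, in which case every finite value is in range.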
    • getRevision

      public String getRevision()
      Returns the revision string.
      Specified by:
      getRevision in interface RevisionHandler
      Returns:
      the revision
    • main

      public static void main(String[] ops)
      Simple main method for testing this class.
      Parameters:
      ops - the commandline options