Class RegressionAnalysis

java.lang.Object
weka.classifiers.evaluation.RegressionAnalysis

public class RegressionAnalysis extends Object
Analyzes a linear regression model by applying Student's t-test to each coefficient. Also calculates the R^2 value and the F-statistic. More information: http://en.wikipedia.org/wiki/Student's_t-test http://en.wikipedia.org/wiki/Linear_regression http://en.wikipedia.org/wiki/Ordinary_least_squares
Version:
$Revision: $
Author:
Chris Meyer (cmeyer@udel.edu), University of Delaware, Newark, DE, USA. CISC 612: Design extension implementation
  • Constructor Summary

    Constructors
    Constructor
    Description
    RegressionAnalysis()
     
  • Method Summary

    Modifier and Type
    Method
    Description
    static double
    calculateAdjRSquared(double rsq, int n, int k)
    Returns the adjusted R-squared value for a linear regression model.
    static double
    calculateFStat(double rsq, int n, int k)
    Returns the F-statistic for a linear regression model.
    static double
    calculateRSquared(Instances data, double ssr)
    Returns the R-squared value for a linear regression model, where sum of squared residuals is already calculated.
    static double
    calculateSSR(Instances data, Attribute chosen, double slope, double intercept)
    Returns the sum of squared residuals of the simple linear regression model: y = a + bx.
    static double[]
    calculateStdErrorOfCoef(Instances data, boolean[] selected, double ssr, int n, int k)
    Returns an array of the standard errors of the coefficients in a multiple linear regression.
    static double[]
    calculateStdErrorOfCoef(Instances data, Attribute chosen, double slope, double intercept, int df)
    Returns the standard errors of slope and intercept for a simple linear regression model: y = a + bx.
    static double[]
    calculateTStats(double[] coef, double[] stderror, int k)
    Returns an array of the t-statistic of each coefficient in a multiple linear regression model.
    String
    getRevision()
    Returns the revision string.

    Methods inherited from class java.lang.Object

    equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Constructor Details

    • RegressionAnalysis

      public RegressionAnalysis()
  • Method Details

    • calculateSSR

      public static double calculateSSR(Instances data, Attribute chosen, double slope, double intercept) throws Exception
      Returns the sum of squared residuals of the simple linear regression model: y = a + bx.
      Parameters:
      data - (the data set)
      chosen - (chosen x-attribute)
      slope - (slope determined by simple linear regression model)
      intercept - (intercept determined by simple linear regression model)
      Returns:
      sum of squared residuals
      Throws:
      Exception - if there is a missing class value in data
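      The quantity described above can be sketched without Weka types. This is an illustrative stand-alone version, not the Weka source: the class SsrSketch, the method sumSquaredResiduals, and the use of plain double arrays in place of Instances and Attribute are all assumptions made for the example, as is treating NaN as a missing value.

```java
/** Illustrative stand-alone sketch; not the Weka implementation. */
final class SsrSketch {

    /**
     * Sum of squared residuals for the model y = intercept + slope * x.
     * The arrays x and y stand in for the chosen attribute and the class values.
     */
    static double sumSquaredResiduals(double[] x, double[] y, double slope, double intercept) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("x and y must have the same length");
        }
        double ssr = 0.0;
        for (int i = 0; i < x.length; i++) {
            if (Double.isNaN(y[i])) {
                // Assumption: NaN marks a missing class value, which is treated as an error.
                throw new IllegalArgumentException("missing y value at index " + i);
            }
            double residual = y[i] - (intercept + slope * x[i]);
            ssr += residual * residual;
        }
        return ssr;
    }

    public static void main(String[] args) {
        double[] x = {1, 2, 3};
        double[] y = {2, 4, 7};
        System.out.println(sumSquaredResiduals(x, y, 2.0, 0.0));
    }
}
```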
    • calculateRSquared

      public static double calculateRSquared(Instances data, double ssr) throws Exception
      Returns the R-squared value for a linear regression model, where sum of squared residuals is already calculated. This works for either a simple or a multiple linear regression model.
      Parameters:
      data - (the data set)
      ssr - (sum of squared residuals)
      Returns:
      R^2 value
      Throws:
      Exception - if there is a missing class value in data
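      A minimal sketch of the usual definition, R^2 = 1 - SSR / SST, where SST is the total sum of squares of the class values around their mean. The class RSquaredSketch and its array-based signature are hypothetical stand-ins for the Instances-based method, and the source does not confirm that Weka computes the value in exactly this way.

```java
/** Illustrative stand-alone sketch; not the Weka implementation. */
final class RSquaredSketch {

    /**
     * R^2 = 1 - SSR / SST for class values y and a precomputed SSR.
     * If y is constant, SST is zero and the result is not finite.
     */
    static double rSquared(double[] y, double ssr) {
        double mean = 0.0;
        for (double v : y) {
            mean += v;
        }
        mean /= y.length;

        double sst = 0.0; // total sum of squares around the mean
        for (double v : y) {
            sst += (v - mean) * (v - mean);
        }
        return 1.0 - ssr / sst;
    }

    public static void main(String[] args) {
        double[] y = {2, 4, 7};
        System.out.println(rSquared(y, 1.0));
    }
}
```

      Because only the class values and the SSR enter the formula, the same computation applies to simple and multiple regression models.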
    • calculateAdjRSquared

      public static double calculateAdjRSquared(double rsq, int n, int k)
      Returns the adjusted R-squared value for a linear regression model. This works for either a simple or a multiple linear regression model.
      Parameters:
      rsq - (the model's R-squared value)
      n - (the number of instances in the data)
      k - (the number of coefficients in the model: k>=2)
      Returns:
      the adjusted R squared value
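      With k counting every coefficient including the constant, the textbook adjustment is 1 - (1 - R^2)(n - 1) / (n - k). The sketch below follows that formula; the class name AdjRSquaredSketch and the choice to return NaN for unusable inputs are assumptions of the example, not documented behaviour of the Weka method.

```java
/** Illustrative stand-alone sketch; not the Weka implementation. */
final class AdjRSquaredSketch {

    /**
     * Adjusted R^2 with k = number of coefficients (constant included).
     * Returns NaN when k < 2 or n <= k (a choice made for this sketch).
     */
    static double adjRSquared(double rsq, int n, int k) {
        if (k < 2 || n <= k) {
            return Double.NaN;
        }
        return 1.0 - (1.0 - rsq) * (n - 1) / (double) (n - k);
    }

    public static void main(String[] args) {
        System.out.println(adjRSquared(0.9, 10, 2));
    }
}
```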
    • calculateFStat

      public static double calculateFStat(double rsq, int n, int k)
      Returns the F-statistic for a linear regression model.
      Parameters:
      rsq - (the model's R-squared value)
      n - (the number of instances in the data)
      k - (the number of coefficients in the model: k>=2)
      Returns:
      F-statistic
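      For the same parameter convention (k includes the constant), the overall F-statistic is commonly written as F = (R^2 / (k - 1)) / ((1 - R^2) / (n - k)). A minimal sketch under that assumption follows; FStatSketch and its NaN guard are hypothetical and only illustrate the formula.

```java
/** Illustrative stand-alone sketch; not the Weka implementation. */
final class FStatSketch {

    /**
     * Overall F-statistic with k - 1 numerator and n - k denominator degrees of freedom.
     * Returns NaN when k < 2 or n <= k (a choice made for this sketch).
     */
    static double fStat(double rsq, int n, int k) {
        if (k < 2 || n <= k) {
            return Double.NaN;
        }
        double explained = rsq / (k - 1);           // explained variance per model degree of freedom
        double unexplained = (1.0 - rsq) / (n - k); // residual variance per residual degree of freedom
        return explained / unexplained;
    }

    public static void main(String[] args) {
        System.out.println(fStat(0.9, 10, 2));
    }
}
```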
    • calculateStdErrorOfCoef

      public static double[] calculateStdErrorOfCoef(Instances data, Attribute chosen, double slope, double intercept, int df) throws Exception
      Returns the standard errors of the slope and intercept for a simple linear regression model: y = a + bx. The first element is the standard error of the slope; the second element is the standard error of the intercept.
      Parameters:
      data - (the data set)
      chosen - (chosen x-attribute)
      slope - (slope determined by simple linear regression model)
      intercept - (intercept determined by simple linear regression model)
      df - (number of instances - 2)
      Returns:
      array of standard errors of slope and intercept
      Throws:
      Exception - if there is a missing class value in data
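      The standard formulas for a simple regression are SE(slope) = sqrt(MSE / Sxx) and SE(intercept) = sqrt(MSE * (1/n + mean(x)^2 / Sxx)), with MSE = SSR / df and Sxx the sum of squared deviations of x from its mean. The sketch below applies them to plain arrays and returns the slope's error first, matching the order described above. SimpleStdErrorSketch and its signature are assumptions of the example, not the Weka source.

```java
/** Illustrative stand-alone sketch; not the Weka implementation. */
final class SimpleStdErrorSketch {

    /** Returns {SE(slope), SE(intercept)} for y = intercept + slope * x, with df = n - 2. */
    static double[] stdErrors(double[] x, double[] y, double slope, double intercept, int df) {
        int n = x.length;

        double xMean = 0.0;
        for (double v : x) {
            xMean += v;
        }
        xMean /= n;

        double sxx = 0.0; // sum of squared deviations of x from its mean
        double ssr = 0.0; // sum of squared residuals
        for (int i = 0; i < n; i++) {
            sxx += (x[i] - xMean) * (x[i] - xMean);
            double residual = y[i] - (intercept + slope * x[i]);
            ssr += residual * residual;
        }

        double mse = ssr / df;
        double seSlope = Math.sqrt(mse / sxx);
        double seIntercept = Math.sqrt(mse * (1.0 / n + xMean * xMean / sxx));
        return new double[] {seSlope, seIntercept};
    }

    public static void main(String[] args) {
        double[] x = {1, 2, 3, 4};
        double[] y = {1, 3, 2, 5};
        // Least-squares fit of this data: slope 1.1, intercept 0.0.
        double[] se = stdErrors(x, y, 1.1, 0.0, x.length - 2);
        System.out.println(se[0] + " " + se[1]);
    }
}
```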
    • calculateStdErrorOfCoef

      public static double[] calculateStdErrorOfCoef(Instances data, boolean[] selected, double ssr, int n, int k) throws Exception
      Returns an array of the standard errors of the coefficients in a multiple linear regression. The last element in the array is the standard error of the constant coefficient. The standard error array is used to calculate the t-statistics.
      Parameters:
      data - (the data set)
      selected - (flags indicating variables used in the regression)
      ssr - (sum of squared residuals)
      n - (number of instances)
      k - (number of coefficients; includes constant)
      Returns:
      array of standard errors of coefficients
      Throws:
      Exception - if there is a missing class value in data
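      In ordinary least squares, the standard error of coefficient j is sqrt(MSE * [(X'X)^-1]_jj), where MSE = SSR / (n - k) and X is the design matrix. The sketch below follows that textbook formula and places the column of ones last so that the constant's error is the final element, as described above. It is not the Weka source: MultiStdErrorSketch, the array-based input (already restricted to the selected attributes), and the small Gauss-Jordan inverse are all assumptions made for illustration.

```java
/** Illustrative stand-alone sketch; not the Weka implementation. */
final class MultiStdErrorSketch {

    /** rows: n observations of the k - 1 selected predictors. Returns k standard errors, constant last. */
    static double[] stdErrors(double[][] rows, double ssr) {
        int n = rows.length;
        int k = rows[0].length + 1;
        double[][] xtx = new double[k][k]; // X'X, with the ones column last
        for (double[] row : rows) {
            double[] xi = new double[k];
            System.arraycopy(row, 0, xi, 0, k - 1);
            xi[k - 1] = 1.0;
            for (int a = 0; a < k; a++) {
                for (int b = 0; b < k; b++) {
                    xtx[a][b] += xi[a] * xi[b];
                }
            }
        }
        double[][] inv = invert(xtx);
        double mse = ssr / (n - k);
        double[] se = new double[k];
        for (int j = 0; j < k; j++) {
            se[j] = Math.sqrt(mse * inv[j][j]);
        }
        return se;
    }

    /** Gauss-Jordan inversion with partial pivoting on an augmented matrix [m | I]. */
    static double[][] invert(double[][] m) {
        int k = m.length;
        double[][] a = new double[k][2 * k];
        for (int i = 0; i < k; i++) {
            System.arraycopy(m[i], 0, a[i], 0, k);
            a[i][k + i] = 1.0;
        }
        for (int col = 0; col < k; col++) {
            int pivot = col;
            for (int r = col + 1; r < k; r++) {
                if (Math.abs(a[r][col]) > Math.abs(a[pivot][col])) {
                    pivot = r;
                }
            }
            if (Math.abs(a[pivot][col]) < 1e-12) {
                throw new ArithmeticException("X'X is singular");
            }
            double[] tmp = a[col];
            a[col] = a[pivot];
            a[pivot] = tmp;
            double p = a[col][col];
            for (int c = 0; c < 2 * k; c++) {
                a[col][c] /= p;
            }
            for (int r = 0; r < k; r++) {
                if (r == col) {
                    continue;
                }
                double f = a[r][col];
                for (int c = 0; c < 2 * k; c++) {
                    a[r][c] -= f * a[col][c];
                }
            }
        }
        double[][] inv = new double[k][k];
        for (int i = 0; i < k; i++) {
            System.arraycopy(a[i], k, inv[i], 0, k);
        }
        return inv;
    }

    public static void main(String[] args) {
        double[][] rows = {{1}, {2}, {3}, {4}};
        double[] se = stdErrors(rows, 2.7);
        System.out.println(se[0] + " " + se[1]);
    }
}
```

      With a single predictor this reduces to the simple-regression case, which gives a convenient consistency check.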
    • calculateTStats

      public static double[] calculateTStats(double[] coef, double[] stderror, int k)
      Returns an array of the t-statistic of each coefficient in a multiple linear regression model.
      Parameters:
      coef - (array holding the value of each coefficient)
      stderror - (array holding each coefficient's standard error)
      k - (number of coefficients, includes constant)
      Returns:
      array of t-statistics of coefficients
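      Each t-statistic is the coefficient divided by its standard error. A minimal sketch, assuming both arrays hold at least k entries in the same order; TStatSketch and its length check are hypothetical additions for the example.

```java
/** Illustrative stand-alone sketch; not the Weka implementation. */
final class TStatSketch {

    /** t[i] = coef[i] / stdError[i] for the first k coefficients. */
    static double[] tStats(double[] coef, double[] stdError, int k) {
        if (coef.length < k || stdError.length < k) {
            throw new IllegalArgumentException("both arrays need at least k entries");
        }
        double[] t = new double[k];
        for (int i = 0; i < k; i++) {
            t[i] = coef[i] / stdError[i];
        }
        return t;
    }

    public static void main(String[] args) {
        double[] t = tStats(new double[] {1.1, 0.0}, new double[] {0.5, 2.0}, 2);
        System.out.println(t[0] + " " + t[1]);
    }
}
```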
    • getRevision

      public String getRevision()
      Returns the revision string.
      Returns:
      the revision