public abstract class Optimization extends java.lang.Object implements TechnicalInformationHandler, RevisionHandler
Here is a sketch of our search strategy; a detailed description of the algorithm can be found in the Appendix of Xin Xu's MSc thesis:
Initialize everything, incl. initial value, direction, etc.
LOOP (main algorithm):
1. Perform the line search using the directions for free variables
1.1 Check all the bounds that are not "active" (i.e., not currently binding
    a variable) and compute the feasible step length to the bound for each
    of them.
1.2 Pick the smallest of these feasible step lengths, say \alpha, and set it
    as the upper bound of the current step length, i.e.
    0 < \lambda <= \alpha (a code sketch of steps 1.1-1.2 follows the loop).
1.3 Search for a step length <= \alpha that results in a "sufficient
    function decrease" (the \alpha condition) AND a "positive definite
    inverse Hessian" (the \beta condition), if possible, using SAFEGUARDED
    polynomial interpolation. This step length is "safe" and is thus used
    to compute the next values of the free variables.
1.4 Fix the variable(s) that are newly bound to their constraint(s).
2. Check whether all variables or their gradients have converged. If they
   have, check whether any of the current bindings of the fixed variables to
   their bounds can be released, based on the "reliable" second-order
   Lagrange multipliers if these are available: if the multiplier of a
   variable is available and negative, release that variable. If it is not
   available, use the first-order Lagrange multiplier to test for release.
   If no variable can be released, STOP the loop and return the result.
   Otherwise update the inverse Hessian matrix and the gradient for the
   newly released variables and CONTINUE the LOOP.
3. Use the BFGS formula to update the inverse Hessian matrix. Note that the
   already-fixed variables must have zeros in the corresponding entries of
   the inverse Hessian.
4. Compute the new (Newton) search direction d = H^{-1}*g, where H^{-1} is
   the inverse Hessian and g is the Jacobian. Note again that the
   already-fixed variables have a zero direction.
ENDLOOP
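To make steps 1.1 and 1.2 above concrete, here is a minimal stand-alone sketch of how the feasible step cap \alpha can be computed for the free variables. The method and variable names are hypothetical; this is not the actual weka.core.Optimization code:

    /**
     * Sketch of steps 1.1-1.2: compute the largest step length alpha along
     * direction d for which no free variable crosses its bound.
     * Hypothetical helper; not part of weka.core.Optimization.
     */
    static double feasibleStepCap(double[] x, double[] d, boolean[] isFixed,
                                  double[] lower, double[] upper) {
        double alpha = Double.MAX_VALUE;
        for (int i = 0; i < x.length; i++) {
            if (isFixed[i] || d[i] == 0.0) {
                continue;                    // only free, moving variables matter
            }
            double step = (d[i] > 0.0)
                ? (upper[i] - x[i]) / d[i]   // distance to the upper bound
                : (lower[i] - x[i]) / d[i];  // distance to the lower bound
            alpha = Math.min(alpha, step);   // the smallest feasible step wins
        }
        return alpha;  // the line search then restricts 0 < lambda <= alpha
    }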
A typical usage of this class is to create your own subclass of it and provide the objective function and gradients as follows:
class MyOpt extends Optimization {

    // Provide the objective function
    protected double objectiveFunction(double[] x) {
        // How to calculate your objective function...
        // ...
    }

    // Provide the first derivatives
    protected double[] evaluateGradient(double[] x) {
        // How to calculate the gradient of the objective function...
        // ...
    }

    // If possible, provide the index^th row of the Hessian matrix
    protected double[] evaluateHessian(double[] x, int index) {
        // How to calculate the index^th variable's second derivative...
        // ...
    }
}

When it is time to use it, in some routine(s) of another class...
MyOpt opt = new MyOpt();

// Set up initial variable values and bound constraints
double[] x = new double[numVariables];
// Lower and upper bounds: 1st row is lower bounds, 2nd is upper
double[][] constraints = new double[2][numVariables];
...

// Find the minimum; 200 iterations is the default
x = opt.findArgmin(x, constraints);
while (x == null) {              // 200 iterations were not enough
    x = opt.getVarbValues();     // Try another 200 iterations
    x = opt.findArgmin(x, constraints);
}

// The minimal function value
double minFunction = opt.getMinFunction();
...

It is recommended that Hessian values be provided so that the second-order Lagrange multiplier estimate can be calculated. However, if they are not provided, there is no need to override the evaluateHessian() function.
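As a concrete, self-contained illustration of the pattern above, the following sketch minimizes the bound-constrained quadratic f(x) = (x0 - 2)^2 + (x1 - 3)^2. Only the Optimization API calls are taken from this documentation; the class name QuadraticOpt and the problem itself are our own, and the getRevision() override may be unnecessary depending on the WEKA version:

    import weka.core.Optimization;

    public class QuadraticOpt extends Optimization {

        // f(x) = (x0 - 2)^2 + (x1 - 3)^2, minimized at (2, 3)
        protected double objectiveFunction(double[] x) {
            return (x[0] - 2) * (x[0] - 2) + (x[1] - 3) * (x[1] - 3);
        }

        // Gradient: [2(x0 - 2), 2(x1 - 3)]
        protected double[] evaluateGradient(double[] x) {
            return new double[] { 2 * (x[0] - 2), 2 * (x[1] - 3) };
        }

        // Row 'index' of the constant Hessian diag(2, 2); optional, but it
        // enables the second-order Lagrange multiplier estimate
        protected double[] evaluateHessian(double[] x, int index) {
            double[] row = new double[x.length];
            row[index] = 2.0;
            return row;
        }

        // May be required by the RevisionHandler interface
        public String getRevision() {
            return "";
        }

        public static void main(String[] args) throws Exception {
            QuadraticOpt opt = new QuadraticOpt();
            double[] x = { 0.0, 0.0 };                       // start inside the bounds
            double[][] constraints = new double[2][];
            constraints[0] = new double[] { -10.0, -10.0 };  // lower bounds
            constraints[1] = new double[] { 10.0, 10.0 };    // upper bounds
            x = opt.findArgmin(x, constraints);
            while (x == null) {                              // needs more iterations
                x = opt.getVarbValues();
                x = opt.findArgmin(x, constraints);
            }
            System.out.println("Minimum at (" + x[0] + ", " + x[1]
                + "), f = " + opt.getMinFunction());
        }
    }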
REFERENCES (see also the getTechnicalInformation() method):
The whole model algorithm is adapted from Chapter 5 and other related
chapters in Gill, Murray and Wright (1981) "Practical Optimization",
Academic Press, and Gill and Murray (1976) "Minimization Subject to Bounds
on the Variables", NPL Report NAC 72, while Chong and Zak (1996) "An
Introduction to Optimization", John Wiley & Sons, Inc., provides a brief
but helpful introduction to the method.
Dennis and Schnabel (1983) "Numerical Methods for Unconstrained Optimization and Nonlinear Equations", Prentice-Hall Inc., and Press et al. (1992) "Numerical Recipes in C", Second Edition, Cambridge University Press, were consulted for the polynomial interpolation used in the line search implementation.
The Hessian modification in the BFGS update uses Cholesky factorization and
two rank-one modifications:
Bk+1 = Bk + (Gk*Gk')/(Gk'*Dk) + (dGk*(dGk)')/[alpha*(dGk)'*Dk]
where Gk is the gradient vector, dGk is the change in the gradient between
iterations, Dk is the direction vector and alpha is the step length.
This method is due to Gill, Golub, Murray and Saunders (1974) "Methods for
Modifying Matrix Factorizations", Mathematics of Computation, Vol. 28,
No. 126, pp. 505-535.
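For readers who want to see the algebra of the update spelled out, the following is a minimal sketch that applies the two rank-one terms of the formula above to a dense symmetric matrix. It only illustrates the formula; the actual implementation updates the Cholesky factorization directly, per Gill, Golub, Murray and Saunders (1974):

    /**
     * Sketch of the two rank-one terms:
     * B += (g*g')/(g'*d) + (dg*dg')/(alpha*dg'*d)
     * where g is the gradient, dg the change in gradient, d the direction
     * and alpha the step length. Illustration only, not WEKA's code.
     */
    static void rankOneUpdates(double[][] B, double[] g, double[] dg,
                               double[] d, double alpha) {
        int n = B.length;
        double gd = 0.0, dgd = 0.0;
        for (int i = 0; i < n; i++) {
            gd += g[i] * d[i];       // Gk' * Dk
            dgd += dg[i] * d[i];     // (dGk)' * Dk
        }
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                B[i][j] += g[i] * g[j] / gd                // (Gk*Gk')/(Gk'*Dk)
                         + dg[i] * dg[j] / (alpha * dgd);  // (dGk*dGk')/(alpha*dGk'*Dk)
            }
        }
    }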
Constructor Summary:

Optimization()
Method Summary:

double[] findArgmin(double[] initX, double[][] constraints)
    Main algorithm.

double getMinFunction()
    Gets the minimal function value.

TechnicalInformation getTechnicalInformation()
    Returns an instance of a TechnicalInformation object, containing
    detailed information about the technical background of this class,
    e.g., the paper reference or book this class is based on.

double[] getVarbValues()
    Gets the variable values.

double[] lnsrch(double[] xold, double[] gradient, double[] direct,
        double stpmax, boolean[] isFixed, double[][] nwsBounds,
        weka.core.Optimization.DynamicIntArray wsBdsIndx)
    Finds a new point x in the direction p from a point xold at which the
    value of the function has decreased sufficiently, the positive
    definiteness of the B matrix (the approximation of the inverse Hessian)
    is preserved, and no bound constraints are violated.

void setDebug(boolean db)
    Sets whether debug mode is on.

void setMaxIteration(int it)
    Sets the maximal number of iterations in the search (default 200).

static double[] solveTriangle(Matrix t, double[] b, boolean isLower,
        boolean[] isZero)
    Solves the linear equation TX = B, where T is a triangular matrix; it
    can be solved using back/forward substitution, with O(N^2) complexity.
Methods inherited from class java.lang.Object:
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface weka.core.RevisionHandler:
getRevision
public TechnicalInformation getTechnicalInformation()
    Specified by:
        getTechnicalInformation in interface TechnicalInformationHandler
public double getMinFunction()
public void setMaxIteration(int it)
    Parameters:
        it - the maximal number of iterations

public void setDebug(boolean db)
    Parameters:
        db - whether to use debug mode

public double[] getVarbValues()
public double[] lnsrch(double[] xold, double[] gradient, double[] direct,
        double stpmax, boolean[] isFixed, double[][] nwsBounds,
        weka.core.Optimization.DynamicIntArray wsBdsIndx)
        throws java.lang.Exception
    Parameters:
        xold - the old x value
        gradient - the gradient at that point
        direct - the direction vector
        stpmax - the maximum step length
        isFixed - indicates whether each variable has been fixed
        nwsBounds - the non-working-set bounds; these variables are free
            and subject to the bound constraints in this step
        wsBdsIndx - the indices of the variables that have working-set
            bounds; these variables are already fixed and no longer
            subject to the constraints
    Throws:
        java.lang.Exception - if an error occurs

public double[] findArgmin(double[] initX, double[][] constraints)
        throws java.lang.Exception
    Parameters:
        initX - the initial point of x, assuming no value lies on a bound
        constraints - the bound constraints of each variable;
            constraints[0] is the lower bounds and constraints[1] is the
            upper bounds
    Throws:
        java.lang.Exception - if an error occurs

public static double[] solveTriangle(Matrix t, double[] b, boolean isLower,
        boolean[] isZero)
    Parameters:
        t - the matrix T
        b - the vector B
        isLower - whether T is a lower or upper triangular matrix
        isZero - indicates which row(s) of T are not used when solving the
            equation; if it is null or all 'false', every row is used
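To illustrate the O(N^2) back/forward substitution that solveTriangle performs, here is a minimal stand-alone sketch operating on plain arrays. It assumes every row is used (i.e., a null isZero mask) and is not the actual WEKA implementation, which operates on a Matrix object:

    /**
     * Sketch: solve T*x = b by substitution in O(N^2), where T is triangular.
     * Simplified analogue of solveTriangle that ignores the isZero mask.
     */
    static double[] solveTriangular(double[][] t, double[] b, boolean isLower) {
        int n = b.length;
        double[] x = new double[n];
        if (isLower) {                            // forward substitution
            for (int i = 0; i < n; i++) {
                double sum = b[i];
                for (int j = 0; j < i; j++) {
                    sum -= t[i][j] * x[j];
                }
                x[i] = sum / t[i][i];
            }
        } else {                                  // back substitution
            for (int i = n - 1; i >= 0; i--) {
                double sum = b[i];
                for (int j = i + 1; j < n; j++) {
                    sum -= t[i][j] * x[j];
                }
                x[i] = sum / t[i][i];
            }
        }
        return x;
    }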