Recommended Optimization Solvers
Continuous Optimization
All optimization solvers in Base R have been moved to Appendix I. This page focuses on more modern, state-of-the-art solvers available in packages on CRAN or GitHub.
Unconstrained optimization
In numerical optimization the goal is to minimize (or maximize) an objective function that depends on real variables: \(\min_x f(x)\). Unconstrained means there are no restrictions on these variables, except possibly box (aka bounds) constraints \(l_i \le x_i \le u_i\) for all \(i\).
ucminf::ucminf
ucminf(par, fn, gr = NULL, ..., control = list(), hessian = 0)
An algorithm for general-purpose unconstrained non-linear optimization. The algorithm is of quasi-Newton type with BFGS updating and soft line searching. It is effective and accurate.
Note: The interface of ucminf is designed for easy interchange with ‘optim’.
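A minimal example, assuming the ‘ucminf’ package is installed; the Rosenbrock test function and starting point are chosen only for illustration:
fr <- function(x) 100*(x[2] - x[1]^2)^2 + (1 - x[1])^2   # Rosenbrock function
sol <- ucminf::ucminf(par = c(-1.2, 1), fn = fr)
sol$par                                                  # approx. c(1, 1)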
lbfgs::lbfgs
lbfgs(call_eval, call_grad, vars, ...,
max_iterations = 0, ftol = 1e-4,
orthantwise_c = 0 # > 0 to start the OWL-QN algorithm
)
lbfgs performs function optimization using the L-BFGS and Orthant-Wise Limited-memory Quasi-Newton (OWL-QN) algorithms. It is a wrapper for the libLBFGS library by Naoaki Okazaki, based on an implementation of the L-BFGS method written by Jorge Nocedal.
Note: There are many more options to set if needed, see the help page. It has no box constraints. The OWL-QN algorithm is particularly suited for higher-dimensional problems (but can be extremely slow).
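A minimal sketch, assuming the ‘lbfgs’ package is installed; note that both the objective and its gradient must be supplied (the Rosenbrock test problem is illustrative only):
fr  <- function(x) 100*(x[2] - x[1]^2)^2 + (1 - x[1])^2   # Rosenbrock function
grr <- function(x) c(-400*x[1]*(x[2] - x[1]^2) - 2*(1 - x[1]),
                      200*(x[2] - x[1]^2))                # its gradient
sol <- lbfgs::lbfgs(fr, grr, vars = c(-1.2, 1))
sol$par                                                   # approx. c(1, 1)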
nloptr::lbfgs
lbfgs(x0, fn, gr = NULL, lower = NULL, upper = NULL,
nl.info = FALSE, control = list(), ...)
varmetric(x0, fn, gr = NULL, rank2 = TRUE, # rank-2|-1 update method
lower = NULL, upper = NULL,
nl.info = FALSE, control = list(), ...)
tnewton(x0, fn, gr = NULL, lower = NULL, upper = NULL,
precond = TRUE, restart = TRUE,
nl.info = FALSE, control = list(), ...)
lbfgs help page | varmetric help page | tnewton help page
Package ‘nloptr’ supports the following quasi-Newton methods:
lbfgs: Low-storage version of the Broyden-Fletcher-Goldfarb-Shanno (BFGS) method.
varmetric: Shifted limited-memory variable-metric algorithm.
tnewton: Truncated Newton method, using a conjugate-gradient approach.
Relevant options are xtol_rel = 1e-6 and maxeval = 1000, but also stopval = -Inf or check_derivatives = FALSE; see nloptr.print.options() for all options.
Note: Package ‘nloptr’ is a wrapper for NLopt, an open-source library for nonlinear optimization.
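A minimal sketch, assuming the ‘nloptr’ package is installed, again with the Rosenbrock function as an illustrative test case:
fr  <- function(x) 100*(x[2] - x[1]^2)^2 + (1 - x[1])^2
grr <- function(x) c(-400*x[1]*(x[2] - x[1]^2) - 2*(1 - x[1]),
                      200*(x[2] - x[1]^2))
sol <- nloptr::lbfgs(c(-1.2, 1), fr, gr = grr,
                     control = list(xtol_rel = 1e-8, maxeval = 1000))
sol$par                                                   # approx. c(1, 1)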
optimx::optimr
optimr(par, fn, gr=NULL, hess=NULL, method=NULL, lower=-Inf, upper=Inf,
hessian=FALSE, control=list(), ...
)
General-purpose optimization wrapper function that calls other R tools for optimization, including the existing optim function, as well as careful reimplementations in pure R of methods originally available in optim.
The optimr function supports the following methods:
optimr(method = "nvm") – variable metric method
optimr(method = "ncg") – conjugate gradient method
Note: This package is under development.
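A minimal sketch, assuming the ‘optimx’ package is installed; the test function and the choice of the "nvm" method are illustrative only:
fr  <- function(x) 100*(x[2] - x[1]^2)^2 + (1 - x[1])^2
grr <- function(x) c(-400*x[1]*(x[2] - x[1]^2) - 2*(1 - x[1]),
                      200*(x[2] - x[1]^2))
sol <- optimx::optimr(c(-1.2, 1), fr, grr, method = "nvm")
sol$par                                                   # approx. c(1, 1)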
Constrained Optimization
There are many different types of constraints. Here we list solvers capable of handling general nonlinear equality and inequality constraints. The preferred method for doing this is the ‘augmented Lagrangian’ approach. – wp:Augmented_Lagrangian_method
alabama::auglag
auglag(par, fn, gr, hin, hin.jac, heq, heq.jac,
control.outer=list(), control.optim = list(), ...
)
auglag implements the Augmented Lagrangian Minimization Algorithm for optimizing smooth nonlinear objective functions with constraints. Linear or nonlinear equality and inequality constraints are allowed.
Note: There is also the function constrOptim.nl, applying an Augmented Lagrangian Adaptive Barrier Minimization Algorithm; it can be used to replace constrOptim in Base R.
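A minimal sketch, assuming the ‘alabama’ package is installed; the quadratic objective and the single inequality constraint (hin(x) >= 0 defines feasibility) are chosen only for illustration:
fn  <- function(x) (x[1] - 1)^2 + (x[2] - 2)^2    # minimize distance to (1, 2)
hin <- function(x) 2 - x[1] - x[2]                # subject to x1 + x2 <= 2
sol <- alabama::auglag(par = c(0, 0), fn = fn, hin = hin)
sol$par                                           # approx. c(0.5, 1.5)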
NlcOptim::solnl
solnl(X = NULL, objfun = NULL, confun = NULL, A = NULL, B = NULL,
Aeq = NULL, Beq = NULL, lb = NULL, ub = NULL, tolX = 1e-05,
tolFun = 1e-06, tolCon = 1e-06, maxnFun = 1e+07, maxIter = 4000
)
solnl implements the Sequential Quadratic Programming (SQP) method to find solutions of general nonlinear optimization problems (with nonlinear objective and constraint functions). Linear or nonlinear equality and inequality constraints are allowed.
Note: The SQP method is described in detail in the book “Numerical Optimization” by Nocedal and Wright. wp:SQP
BB::spg
spg(par, fn, gr=NULL, method=3, lower=-Inf, upper=Inf,
project=NULL, projectArgs=NULL,
control=list(), quiet=FALSE, alertConvergence=TRUE, ...
)
spg applies Barzilai-Borwein spectral methods for large-scale nonlinear optimization subject to simple constraints. Based on code from the TANGO project. wp:BB
Note: Use the projection function projectLinear for equality and inequality constraints A x - b >= 0.
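A minimal sketch, assuming the ‘BB’ package is installed; the bounds and the test function are illustrative only:
fr  <- function(x) 100*(x[2] - x[1]^2)^2 + (1 - x[1])^2
sol <- BB::spg(par = c(0.5, 1.5), fn = fr, lower = c(0, 0), upper = c(2, 2))
sol$par                                           # approx. c(1, 1)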
nloptr::slsqp
slsqp(x0, fn, gr = NULL, lower = NULL, upper = NULL,
hin = NULL, hinjac = NULL, heq = NULL, heqjac = NULL,
nl.info = FALSE, control = list(), ...
)
cobyla(x0, fn, lower = NULL, upper = NULL, hin = NULL,
nl.info = FALSE, control = list(), ...
)
slsqp help page, cobyla help page, and NLopt Documentation
slsqp implements a Sequential (least-squares) Quadratic Programming (SQP) algorithm for nonlinearly constrained, gradient-based optimization, supporting both equality and inequality constraints.
cobyla implements an algorithm for derivative-free optimization with nonlinear inequality constraints (no equality constraints). It constructs successive linear approximations of the objective function and constraints via a simplex of n+1 points.
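A minimal sketch, assuming the ‘nloptr’ package is installed; heq(x) = 0 defines the equality constraint, and the problem itself is purely illustrative:
fn  <- function(x) sum(x^2)                       # minimize x1^2 + x2^2
heq <- function(x) x[1] + x[2] - 1                # subject to x1 + x2 = 1
sol <- nloptr::slsqp(c(1, 1), fn, heq = heq)
sol$par                                           # approx. c(0.5, 0.5)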
Quadratic Optimization
Quadratic optimization (or: Quadratic Programming, QP) problems are optimization problems of the form \(\min_x 0.5\, x'Px + q' x\) with linear or quadratic constraints. These problems are ubiquitous and appear in almost all data fitting and least-squares approximation tasks. – wp:Quadratic programming
osqp::solve_osqp
solve_osqp(P = NULL, q = NULL,
A = NULL, l = NULL, u = NULL,
pars = osqpSettings()
)
solve_osqp solves Quadratic Programming (QP) problems with linear inequality constraints, using the OSQP library developed at the University of Oxford. P and q are as above, the linear inequalities are given as l <= Ax <= u, with A a (sparse) matrix.
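A minimal sketch, assuming the ‘osqp’ package is installed, using the small demo QP from the OSQP documentation:
P <- matrix(c(4, 1, 1, 2), 2, 2)                  # quadratic term
q <- c(1, 1)                                      # linear term
A <- matrix(c(1, 1, 1, 0, 0, 1), 3, 2, byrow = TRUE)
l <- c(1, 0, 0); u <- c(1, 0.7, 0.7)              # l <= A x <= u
sol <- osqp::solve_osqp(P, q, A, l, u)
sol$x                                             # approx. c(0.3, 0.7)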
piqp::solve_piqp
solve_piqp(P = NULL, c = NULL,
A = NULL, b = NULL, G = NULL, h = NULL,
x_lb = NULL, x_ub = NULL,
settings = list(), backend = c("auto", "sparse", "dense")
)
solve_piqp solves Quadratic Programming (QP) problems with linear equality and inequality constraints, using the very fast “Proximal Interior Point Quadratic Programming” solver developed at EPFL. Here c is q above, equality constraints are A x = b, inequality constraints G x <= h, and box constraints x_lb <= x <= x_ub.
Least-squares Problems
Least-squares is a standard approach in regression analysis to approximate the solution of linear or nonlinear systems. Nonlinear least-squares problems are usually solved by iterative procedures, quite similar to optimization problems. – wp:Least squares
nlsr::nlsr
nlsr(formula = NULL, data = NULL, start = NULL,
control = NULL, trace = FALSE, subset = NULL,
lower = -Inf, upper = Inf, weights = NULL, ...
)
nlsr provides solutions to a nonlinear least-squares problem using the Nash Marquardt tools, i.e., it internally applies a variant of the Levenberg-Marquardt algorithm. Conceived as a replacement for nls.
minpack.lm::nls.lm
nls.lm(par, lower=NULL, upper=NULL, fn, jac = NULL,
control = nls.lm.control(), ...
)
nls.lm minimizes the sum of squares of the vector returned by the function fn, by a modification of the Levenberg-Marquardt algorithm. The user may also provide a function jac which calculates the Jacobian.
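A minimal sketch, assuming the ‘minpack.lm’ package is installed; fn must return the vector of residuals, here for the illustrative model y ~ a * exp(-b * x) with simulated data:
set.seed(1)
x <- 1:10
y <- 2 * exp(-0.3 * x) + rnorm(10, sd = 0.01)      # simulated data
res_fn <- function(p) y - p[1] * exp(-p[2] * x)    # residual function
sol <- minpack.lm::nls.lm(par = c(a = 1, b = 0.1), fn = res_fn)
sol$par                                            # approx. a = 2, b = 0.3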
Convex Optimization
A convex optimization problem is one where the objective function and all constraints are convex. There are very effective interior-point methods to solve convex problems quickly and with high accuracy. Convex problems have only one local optimum, i.e., a local solution is also a global solution. – wp:Convex optimization
CVXR::solve
n <- length(data)
x <- Variable(n)                             # initialize n variables
objective <- Minimize(p_norm(data - x, 2))   # define the objective function
constraint <- list(diff(x) >= 0)             # and the list of constraints
problem <- Problem(objective, constraint)    # check the problem for convexity
result <- solve(problem)                     # and send it to a convex solver
result$getValue(x)                           # retrieve the results
‘CVXR’ is an R package that implements the disciplined convex programming (DCP) approach by Stephen Boyd and colleagues to verify the problem’s convexity. It provides an object-oriented modeling language that allows the user to formulate convex optimization problems in a natural mathematical syntax. When the model has been verified, it will be sent to a convex solver such as ECOS or SCS.
Note: The code above is an example of what the modeling language looks like; it solves an isotonic least-squares regression with constraints \(x_1 \le x_2 \le \ldots \le x_n\). There is no single solve function whose usage could be described here; see the Tutorial Examples in the documentation.
Remark: There are more convex solvers in R packages (cf. ‘clarabel’, ‘sdpt3r’, ‘cccp’, etc.), but their usage can be a quite complex task even for an experienced user.
Derivative-free Optimization
‘Derivative-free’ means numerical optimization without the use of gradients (or hessians). The most popular candidate is Nelder-Mead, a simplex-based direct search method. – wp:Nelder-Mead and wp:Derivative-free optimization
Note: Saying an algorithm is derivative-free does not mean it will minimize nonsmooth objective functions effectively!
dfoptim::nmk
nmk(par, fn, control = list(), ...)
nmkb(par, fn, lower=-Inf, upper=Inf, control = list(), ...)
nmk is an implementation of Nelder-Mead based on MATLAB code by C.T. Kelley for his book “Iterative Methods for Optimization”. The function nmkb allows box constraints for Nelder-Mead.
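A minimal example, assuming the ‘dfoptim’ package is installed (for nmkb the starting point must lie strictly inside the bounds); the test function and bounds are illustrative only:
fr <- function(x) 100*(x[2] - x[1]^2)^2 + (1 - x[1])^2
dfoptim::nmk(par = c(-1.2, 1), fn = fr)$par                             # approx. c(1, 1)
dfoptim::nmkb(par = c(0.5, 0.5), fn = fr, lower = 0, upper = 0.75)$par  # solution on the boundary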
nloptr::neldermead
neldermead(x0, fn, lower = NULL, upper = NULL,
nl.info = FALSE, control = list(), ...
)
An implementation of almost the original Nelder-Mead simplex algorithm, with explicit support for box constraints. The function is quite fast and accurate.
pracma::anms
anms(fn, x0, ..., tol = 1e-10, maxfeval = NULL)
anms is an implementation of Nelder-Mead with adaptive parameters, based on ideas of Gao and Han (2012). It is particularly suitable for higher-dimensional problems (20-30 variables).
Global and Stochastic Optimization
Differential Evolution
Differential Evolution (DE) optimizes a problem by maintaining a population of candidate solutions and creating new candidate solutions by combining existing ones according to its simple formulae, and then keeping whichever candidate solution has the best score or fitness. – wp:DE and wp:Global optimization
DEoptim::DEoptim
DEoptim(fn, lower, upper,
control = DEoptim.control(), ..., fnMap=NULL
)
DEoptim implements the Differential Evolution (DE) algorithm similar to what is described in the book “Differential Evolution – A Practical Approach to Global Optimization” by Price et al. The package relies on an interface to a C implementation of DE, which is effective and fast.
The control options NP (population size) and itermax should probably be set 5-10 times higher than their defaults. The recommended strategies are 2 (the default) or 3, set via DEoptim.control(strategy = ..).
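A minimal sketch, assuming the ‘DEoptim’ package is installed; the Rastrigin function is a standard multimodal test case with global minimum at the origin:
rastrigin <- function(x) 10*length(x) + sum(x^2 - 10*cos(2*pi*x))
set.seed(1234)                                     # DE is stochastic
sol <- DEoptim::DEoptim(rastrigin, lower = c(-5, -5), upper = c(5, 5),
                        control = DEoptim::DEoptim.control(NP = 100, itermax = 500,
                                                           trace = FALSE))
sol$optim$bestmem                                  # approx. c(0, 0)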
DEoptimR::JDEoptim
JDEoptim(lower, upper, fn,
constr = NULL, meq = 0,
NP = 10*length(lower),
maxiter = 200*length(lower), ...
)
JDEoptim is a pure-R implementation of a Differential Evolution (DE) algorithm with “self-adapting control parameters”, as suggested by Brest et al. (2006, with Java code). It incorporates very general nonlinear (equality and inequality) constraints (though equality constraints seem to be problematic).
Note: It has many more control options than are shown on the usage line. The default values for population size NP and maxiter are too small.
Simulated Annealing
Simulated Annealing is a stochastic search technique for global optimization. It is often used for functions with very many local minima. The algorithm is inspired by the annealing process in metallurgy.
– wp:Simulated Annealing
GenSA::GenSA
GenSA(par = NULL, fn, lower, upper, control = list(), ...)
GenSA searches for the global minimum of complicated nonlinear objective functions with a large number of optima. It implements a generalized “Simulated Annealing” (SA) approach in C++, which is effective though not always as successful as Differential Evolution.
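A minimal sketch, assuming the ‘GenSA’ package is installed, again on the multimodal Rastrigin test function:
rastrigin <- function(x) 10*length(x) + sum(x^2 - 10*cos(2*pi*x))
set.seed(1234)
sol <- GenSA::GenSA(par = c(1, 1), fn = rastrigin,
                    lower = c(-5, -5), upper = c(5, 5),
                    control = list(maxit = 500))
sol$par                                            # approx. c(0, 0)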
Genetic Algorithms
Particle-swarm Optimization
Particle-swarm Optimization (PSO) is a heuristic approach that optimizes a problem by iteratively trying to improve a set of particles (candidate solutions) with regard to a given measure of quality. The original idea was to simulate social interactions in a swarm of, e.g., fish or birds when looking for food. – wp:PSO
pso::psoptim
psoptim(par, fn, gr = NULL, ..., lower = -1, upper = 1,
control = list()
)
psoptim is a general implementation of particle swarm optimization usable as a direct replacement for optim. Control elements include the maximum number of iterations or the swarm size.
Note: This implementation of “Particle Swarm Optimization” is not derivative-free!
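A minimal sketch, assuming the ‘pso’ package is installed; a vector of NAs fixes the problem dimension and lets psoptim choose random starting positions (the Rastrigin test function is illustrative only):
rastrigin <- function(x) 10*length(x) + sum(x^2 - 10*cos(2*pi*x))
set.seed(1234)
sol <- pso::psoptim(rep(NA, 2), rastrigin, lower = rep(-5, 2), upper = rep(5, 2),
                    control = list(maxit = 500))
sol$par                                            # approx. c(0, 0)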
CMA-ES
Covariance matrix adaptation evolution strategy (CMA-ES) is a particular kind of evolution strategy for numerical optimization of non-linear or non-convex continuous optimization problems. For more detailed information see – wp:CMA-ES
rCMA::cmaOptimDP
cmaOptimDP(cma, fitFunc, isFeasible = function(x) { TRUE },
maxDimPrint = 5, iterPrint = 10, verbose = 2)
Note: cmaOptimDP utilizes Java code by Hansen. The package therefore depends on ‘rJava’, which on some systems may lead to installation problems.
adagio::pureCMAES
pureCMAES(par, fun, lower = NULL, upper = NULL, sigma = 0.5,
stopfitness = -Inf, stopeval = 1000*length(par)^2, ...)
The Octave/MATLAB function pureCMAES has been converted to R. This function can be used for higher-dimensional search spaces (up to 30-50 variables), though it can become very slow. The results are in general quite good.
Note: There are more implementations of CMA-ES, see the CRAN Optimization Task View.
Linear and Mixed-integer Programming
Linear Programming
Linear programming is a technique for the optimization of a linear objective function, subject to linear equality and inequality constraints. – wp:Linear Programming
highs::highs_solve
highs_solve(Q = NULL, L, lower, upper,
A, lhs, rhs, types, maximum = FALSE, offset = 0, control = highs_control()
)
See the highs_solve help page and the HiGHS web page.
‘highs’ wraps the HiGHS solver, currently among the best open-source mixed-integer linear programming solvers. It solves linear and quadratic problems of the form \(\min_x 0.5\, x'Qx + Lx\) with linear and bounds constraints \(\text{lhs} \le Ax \le \text{rhs}\), \(\text{lower} \le x \le \text{upper}\), and different types of variables (1 or ‘C’ continuous, 2 or ‘I’ integer, 3 or ‘SC’ semi-continuous, etc.).
Note: For binary variables, choose ‘I’ and restrict between 0 and 1. Quadratic problems do not allow for integer variables.
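A minimal sketch, assuming the ‘highs’ package is installed; the tiny LP below is purely illustrative, and str() is used to inspect the returned solution list:
# minimize x1 + 2*x2  subject to  x1 + x2 >= 1,  x >= 0   (solution: x = c(1, 0))
sol <- highs::highs_solve(L = c(1, 2),
                          lower = c(0, 0), upper = c(Inf, Inf),
                          A = matrix(c(1, 1), nrow = 1),
                          lhs = 1, rhs = Inf)
str(sol)    # inspect the primal solution and the objective value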
rcbc::cbc_solve
cbc_solve(obj, mat,
row_ub = rep.int(Inf, nrow(mat)), # nrow(mat) no. of constraints
row_lb = rep.int(-Inf, nrow(mat)),
col_lb = rep.int(-Inf, ncol(mat)), # ncol(mat) no. of variables
col_ub = rep.int(Inf, ncol(mat)),
is_integer = rep.int(FALSE, ncol(mat)), # type of variables
max = FALSE,
cbc_args = list(),
initial_solution = NULL
)
CLP and CBC are open-source COIN-OR projects. CLP solves linear programs and CBC adds handling of integer variables. ‘rcbc’ provides an interface for mixed-integer linear programs (without integer variables, CLP will be called).
Remark: ‘rcbc’ has to be installed from https://dirkschumacher.github.io/rcbc/.
Multivariate Root Finding
Finding roots of multivariate functions could in principle be done by minimizing the sum of squares of the function values. But there are more specialized procedures to perform this task, inspired by Gauss-Newton or spectral approaches.
nleqslv::nleqslv
nleqslv(x, fn, jac=NULL, ...,
method = c("Broyden", "Newton"),
global = c("dbldog", "pwldog", "cline", "qline", "gline", "hook", none"),
xscalm = c("fixed","auto"),
jacobian=FALSE, control = list()
)
Solves a system of nonlinear equations using a Broyden or a Newton method with a choice of global strategies such as line search and trust region. fn must be a function \(f: \mathbb{R}^n \to \mathbb{R}^n\).
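A minimal example, assuming the ‘nleqslv’ package is installed, on a small nonlinear system with root (1, 1):
fn <- function(x) c(x[1]^2 + x[2]^2 - 2,           # x1^2 + x2^2 = 2
                    exp(x[1] - 1) + x[2]^3 - 2)    # exp(x1-1) + x2^3 = 2
sol <- nleqslv::nleqslv(c(2, 0.5), fn)
sol$x                                              # approx. c(1, 1)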
BB::sane
sane(par, fn, method=2, control=list(),
quiet=FALSE, alertConvergence=TRUE, ...
)
dfsane(par, fn, method=2, control=list(),
quiet=FALSE, alertConvergence=TRUE, ...
)
sane help page and dfsane help page
sane provides a non-monotone spectral approach for solving large-scale nonlinear systems of equations, and dfsane provides a derivative-free version of this spectral approach. The function value must have the same length as the input vector.
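A minimal sketch, assuming the ‘BB’ package is installed; the derivative-free spectral solver applied to the same illustrative system as above:
fn <- function(x) c(x[1]^2 + x[2]^2 - 2,
                    exp(x[1] - 1) + x[2]^3 - 2)
sol <- BB::dfsane(par = c(2, 0.5), fn = fn)
sol$par                                            # approx. c(1, 1)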
Appendix I: R Solvers
Base R Solvers
R provides optimize (univariate optimization) and optim or nlminb (multivariate optimization) for function minimization. Besides that, there is nlm, which in general is not recommended.
optimize
optimize(f, interval, ..., lower = min(interval), upper = max(interval),
maximum = FALSE, tol = .Machine$double.eps^0.25
)
Univariate (i.e., one-dimensional) optimization aims to find the optimum of a continuous, real-valued function in an interval of the real numbers (with respect to its first argument). optimize uses a combination of golden section search and successive parabolic interpolation.
Note: When there is more than one local minimum, it is uncertain which one will be chosen.
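Example (an illustrative univariate function with several local minima on the interval):
f <- function(x) (x - 2)^2 + sin(5 * x)
optimize(f, interval = c(0, 4))$minimum            # location of a (local) minimum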
optim
optim(par, fn, gr = NULL, ..., method = c("Nelder-Mead"),
lower = -Inf, upper = Inf,
control = list(), hessian = FALSE
)
General-purpose multivariate optimization; different methods such as Nelder-Mead, BFGS, and SANN are provided. The gradient function, if needed, can be user-supplied or will be approximated with finite differences. It includes options for handling box constraints.
The optim function supports the following methods:
optim(method="Nelder-Mead") – Nelder-Mead, derivative-free and robust, but relatively slow.
optim(method="BFGS") – ‘BFGS’ implementation, a quasi-Newton method.
optim(method="L-BFGS-B") – ‘BFGS’ variant that allows for box constraints.
optim(method="CG") – conjugate gradients method, not recommended.
optim(method="SANN") – a variant of Simulated Annealing, not recommended.
optim(method="Brent") – univariate optimization, calls optimize, see above.
See the optim Cheat Sheet for more information about optim and the other optimization solvers in Base R, nlm and nlminb.
Note: Default is the derivative-free Nelder-Mead algorithm. If the objective function has derivatives, applying a ‘BFGS’ variant is definitely to be preferred.
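Example (the Rosenbrock function and gradient, as in the optim help page), comparing the default Nelder-Mead with BFGS using an analytic gradient:
fr  <- function(x) 100*(x[2] - x[1]^2)^2 + (1 - x[1])^2
grr <- function(x) c(-400*x[1]*(x[2] - x[1]^2) - 2*(1 - x[1]),
                      200*(x[2] - x[1]^2))
optim(c(-1.2, 1), fr)$par                          # Nelder-Mead (default), approx. c(1, 1)
optim(c(-1.2, 1), fr, grr, method = "BFGS")$par    # quasi-Newton with gradient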
nlminb
nlminb(start, objective, gradient = NULL, hessian = NULL, ...,
scale = 1, control = list(), lower = -Inf, upper = Inf
)
Uses the PORT library, a Fortran implementation of quasi-Newton BFGS, for unconstrained and box-constrained optimization.
Note: In general, optim is to be preferred, or use ucminf, see above.
constrOptim
constrOptim(theta, f, grad, ui, ci, mu = 1e-04, control = list(),
method = if(is.null(grad)) "Nelder-Mead" else "BFGS",
outer.iterations = 100, outer.eps = 1e-05, ...,
hessian = FALSE
)
Minimise a function subject to linear inequality constraints using an adaptive barrier algorithm. The feasible region is defined by ui %*% theta - ci >= 0
. The starting value must be in the interior of the feasible region, but the minimum may be on the boundary.
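Example (an illustrative quadratic objective with one linear inequality constraint, written as ui %*% theta - ci >= 0):
f <- function(x) (x[1] - 2)^2 + (x[2] - 2)^2       # minimize distance to (2, 2)
g <- function(x) c(2*(x[1] - 2), 2*(x[2] - 2))     # gradient
# constraint x1 + x2 <= 1, i.e. -x1 - x2 + 1 >= 0; start c(0, 0) is interior
constrOptim(c(0, 0), f, g, ui = matrix(c(-1, -1), nrow = 1), ci = -1)$par
# approx. c(0.5, 0.5), on the boundary of the feasible region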
nls
nls(formula, data, start, control, algorithm,
trace, subset, weights, na.action, model,
lower, upper, ... )
nls determines the nonlinear (weighted) least-squares estimates of the parameters of a nonlinear model. Lower and upper bounds can only be used with the port algorithm. The default is Gauss-Newton; another possibility is plinear for partially linear least-squares models.
Note: nls has problems with ‘singular gradients’; instead, use one of the two alternatives, nlsr or minpack.lm, listed above.
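Example (from the nls help page, fitting a self-starting logistic model to the DNase data shipped with R):
DNase1 <- subset(DNase, Run == 1)
fm1 <- nls(density ~ SSlogis(log(conc), Asym, xmid, scal), data = DNase1)
summary(fm1)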
Appendix II: External Solvers
Ipopt through JuliaCall
The ROI Infrastructure
Utilizing the NEOS Server
References
Boyd, S., and L. Vandenberghe. Convex Optimization. Cambridge University Press, 2004.
URL: https://web.stanford.edu/~boyd/cvxbook/
Nocedal, J., and S.J. Wright. Numerical Optimization. Springer Verlag, New York 1999.
Price, K.V., R.M. Storn and J.A. Lampinen. Differential Evolution – A Practical Approach to Global Optimization. Springer Verlag, Berlin Heidelberg 2005.
Schwendinger, F., and H.W. Borchers (2023). CRAN Task View: Optimization and Mathematical Programming. Version 2023-08-19. URL: https://CRAN.R-project.org/view=Optimization.
Vanderbei, R.J. Linear Programming: Foundations and Extensions. 5th Edition, Springer Verlag, New York 2020.
Computational Optimization Open Textbook. Cornell University, New York 2020.
URL: https://optimization.cbe.cornell.edu/index.php?title=Main_Page