Home > manopt > solvers > arc > minimize_cubic_newton.m

minimize_cubic_newton

PURPOSE

Minimize a cubicly regularized quadratic via Newton root finding.

SYNOPSIS

function [y, iter, lambda, status] = minimize_cubic_newton(H, g, sigma, options)

DESCRIPTION

``` Minimize a cubicly regularized quadratic via Newton root finding.

[y, iter, lambda, status] = minimize_cubic_newton(H, g, sigma, options)

Inputs: a symmetric matrix H of size n, a nonzero vector g of length n,
a positive real sigma and an options structure. The code expects H to
be tridiagonal, stored as a sparse matrix.

The main output is a vector y of length n, which should minimize

f(y) = g'*y + (1/2)*y'*H*y + (1/3)*sigma*norm(y)^3.

This is achieved by reducing the problem to a univariate root finding
problem, where the unknown is a scalar lambda. This root is computed
using a Newton method.

Other outputs are iter (the number of Newton iterations completed),
lambda (a real scalar, see below) and status. The latter is 0 if the
target tolerance was reached, 1 if subsequent iterations induce no
significant change, and -1 if the algorithm return because it reached
the maximum number of iterations (see the options structure.)
Non-negative status values are considered successes.

The options structure must contain the following fields (between
parentheses are some recommended values):
options.verbosity (3): to control how much information this function
prints to the command window. Anything below 6 silences the function.
options.maxiter_newton (100): maximum number of Newton iterations.
options.tol_newton (1e-16): tolerance on the root finding accuracy. See
in code for details.

The code is based on Section 6 in
Cartis, Gould and Toint, "Adaptive cubic regularisation methods for
unconstrained optimization. Part I: motivation, convergence and numerical
results", Mathematical Programming, 2011.

Theorem 3.1 in the referenced paper states y is optimal if and only
if it there exists a real lambda such that

(H + lambda*I)y = -g,  lambda = sigma*||y||  and  H + lambda*I is psd,

where psd means positive semidefinite. The other way around, if we
find the corresponding scalar lambda, than we can recover y by
solving a linear system (though this system might not have a unique
solution in full generality.) Thus, the general strategy is to search
for lambda rather than for y.

CROSS-REFERENCE INFORMATION

This function calls:
• norm NORM Norm of a TT/MPS tensor.
• norm NORM Norm of a TT/MPS block-mu tensor.
This function is called by:
• arc_lanczos Subproblem solver for ARC based on a Lanczos process.

SOURCE CODE

```0001 function [y, iter, lambda, status] = minimize_cubic_newton(H, g, sigma, options)
0002 % Minimize a cubicly regularized quadratic via Newton root finding.
0003 %
0004 % [y, iter, lambda, status] = minimize_cubic_newton(H, g, sigma, options)
0005 %
0006 % Inputs: a symmetric matrix H of size n, a nonzero vector g of length n,
0007 % a positive real sigma and an options structure. The code expects H to
0008 % be tridiagonal, stored as a sparse matrix.
0009 %
0010 % The main output is a vector y of length n, which should minimize
0011 %
0012 %   f(y) = g'*y + (1/2)*y'*H*y + (1/3)*sigma*norm(y)^3.
0013 %
0014 % This is achieved by reducing the problem to a univariate root finding
0015 % problem, where the unknown is a scalar lambda. This root is computed
0016 % using a Newton method.
0017 %
0018 % Other outputs are iter (the number of Newton iterations completed),
0019 % lambda (a real scalar, see below) and status. The latter is 0 if the
0020 % target tolerance was reached, 1 if subsequent iterations induce no
0021 % significant change, and -1 if the algorithm return because it reached
0022 % the maximum number of iterations (see the options structure.)
0023 % Non-negative status values are considered successes.
0024 %
0025 % The options structure must contain the following fields (between
0026 % parentheses are some recommended values):
0027 %   options.verbosity (3): to control how much information this function
0028 %   prints to the command window. Anything below 6 silences the function.
0029 %   options.maxiter_newton (100): maximum number of Newton iterations.
0030 %   options.tol_newton (1e-16): tolerance on the root finding accuracy. See
0031 %   in code for details.
0032 %
0033 % The code is based on Section 6 in
0034 % Cartis, Gould and Toint, "Adaptive cubic regularisation methods for
0035 % unconstrained optimization. Part I: motivation, convergence and numerical
0036 % results", Mathematical Programming, 2011.
0038 %
0039 % Theorem 3.1 in the referenced paper states y is optimal if and only
0040 % if it there exists a real lambda such that
0041 %
0042 % (H + lambda*I)y = -g,  lambda = sigma*||y||  and  H + lambda*I is psd,
0043 %
0044 % where psd means positive semidefinite. The other way around, if we
0045 % find the corresponding scalar lambda, than we can recover y by
0046 % solving a linear system (though this system might not have a unique
0047 % solution in full generality.) Thus, the general strategy is to search
0048 % for lambda rather than for y.
0049 %
0051
0052 % This file is part of Manopt: www.manopt.org.
0053 % Original authors: May 1, 2018,
0054 %    Naman Agarwal, Brian Bullins, Nicolas Boumal and Coralia Cartis.
0055 % Contributors:
0056 % Change log:
0057
0058     n = size(H, 1);
0059
0060     % Pick an initial lambda that is cheap to compute and that surely makes
0061     % the shifted H positive definite.
0062     lambda = norm(H, 1) + 2;
0063     H_shifted = H + lambda*speye(n);
0064
0065     % Compute the smallest eigenvalue of H, as we know the target lambda
0066     % must be at least as large as the negative of that, so that the
0067     % shifted H will be positive semidefinite.
0068     %
0069     % Since H ought to be sparse and tridiagonal, and since we only need
0070     % its smallest eigenvalue, this computation could be sped up
0071     % significantly. It does not appear to be a bottleneck, and eig is
0072     % simple and reliable, so we keep this for now.
0073     lambda_min = min(eig(H));
0074     left_barrier = max(0, -lambda_min);
0075
0076     % Counter 'iter' holds the number of fully executed Newton iterations.
0077     iter = 0;
0078     while true
0079
0080         if iter >= options.maxiter_newton
0081             % Iterations exceeded maximum number allowed.
0082             status = -1;
0083             return;
0084         end
0085
0086         % If lambda has the correct value and the shifted H is positive
0087         % definite, then this y is a minimizer.
0088         y = -(H_shifted\g);
0089         ynorm = norm(y);
0090
0091         % If the following quantity is zero, we have found a solution.
0092         phi = 1/ynorm - sigma/lambda;
0093
0094         % Check if it is close enough to zero to stop.
0095         if abs(phi) <= options.tol_newton*ynorm
0096             status = 0;
0097             return;
0098         end
0099         psi = ynorm^2;
0100
0101         % TODO: clarify this part of the code (see referenced paper).
0102         % The following is a Newton type of step on the equation
0103         % sigma/lambda = 1/sqrt(psi(lambda_prev)) ...
0104         %          - (lambda - lambda_prev)((psi'(lambda_prev))/2(psi)^1.5)
0105         delta_y = -(H_shifted\y);
0106         psi_prime = 2*(y'*delta_y);
0107         p0 = 2*sigma*(psi^(1.5));
0108         p1 = -2*psi - lambda*psi_prime;
0109         p2 = psi_prime;
0110         r = roots([p2 p1 p0]);
0111         del_lambda = max(r) - lambda;
0112         iter = iter + 1;
0113
0114         % If the Newton step would bring us left of the left barrier, jump
0115         % instead to the midpoint between the left barrier and the current
0116         % lambda.
0117         if lambda + del_lambda <= left_barrier
0118             del_lambda = -.5*(lambda - left_barrier);
0119         end
0120
0121         % If the step is so small that it numerically does not make a
0122         % difference when added to the current lambda, we stop.
0123         if abs(del_lambda) <= eps(lambda)
0124             status = 1;
0125             return;
0126         end
0127
0128         % Update lambda
0129         H_shifted = H_shifted + del_lambda*speye(n);
0130         lambda = lambda + del_lambda;
0131
0132
0133         if options.verbosity >= 6
0134             fprintf(['lambda %.12e, ||y|| %.12e, lambda/sigma %.12e, ' ...
0135                      'phi %.12e\n\n'], lambda, ynorm, lambda / sigma, phi);
0136         end
0137
0138     end
0139
0140 end```

Generated on Fri 30-Sep-2022 13:18:25 by m2html © 2005