Description of minimize_cubic

 Minimize a cubicly regularized quadratic via Newton root finding.

 [y, iter, lambda, status] = minimize_cubic_newton(H, g, sigma, options)

 Inputs: a symmetric matrix H of size n, a nonzero vector g of length n,
 a positive real sigma and an options structure. The code expects H to
 be tridiagonal, stored as a sparse matrix.

 The main output is a vector y of length n, which should minimize

   f(y) = g'*y + (1/2)*y'*H*y + (1/3)*sigma*norm(y)^3.

 This is achieved by reducing the problem to a univariate root finding
 problem, where the unknown is a scalar lambda. This root is computed
 using a Newton method.

 Other outputs are iter (the number of Newton iterations completed),
 lambda (a real scalar, see below) and status. The latter is 0 if the
 target tolerance was reached, 1 if subsequent iterations induce no
 significant change, and -1 if the algorithm return because it reached
 the maximum number of iterations (see the options structure.)
 Non-negative status values are considered successes.

 The options structure must contain the following fields (between
 parentheses are some recommended values):
   options.verbosity (3): to control how much information this function
   prints to the command window. Anything below 6 silences the function.
   options.maxiter_newton (100): maximum number of Newton iterations.
   options.tol_newton (1e-16): tolerance on the root finding accuracy. See
   in code for details.

 The code is based on Section 6 in
 Cartis, Gould and Toint, "Adaptive cubic regularisation methods for
 unconstrained optimization. Part I: motivation, convergence and numerical
 results", Mathematical Programming, 2011.
 https://link.springer.com/article/10.1007/s10107-009-0286-5
 
 Theorem 3.1 in the referenced paper states y is optimal if and only
 if it there exists a real lambda such that
 
 (H + lambda*I)y = -g,  lambda = sigma*||y||  and  H + lambda*I is psd,
 
 where psd means positive semidefinite. The other way around, if we
 find the corresponding scalar lambda, than we can recover y by
 solving a linear system (though this system might not have a unique
 solution in full generality.) Thus, the general strategy is to search
 for lambda rather than for y.

 See also: arc arc_lanczos

0001 function [y, iter, lambda, status] = minimize_cubic_newton(H, g, sigma, options)
0002 % Minimize a cubicly regularized quadratic via Newton root finding.
0003 %
0004 % [y, iter, lambda, status] = minimize_cubic_newton(H, g, sigma, options)
0005 %
0006 % Inputs: a symmetric matrix H of size n, a nonzero vector g of length n,
0007 % a positive real sigma and an options structure. The code expects H to
0008 % be tridiagonal, stored as a sparse matrix.
0009 %
0010 % The main output is a vector y of length n, which should minimize
0011 %
0012 %   f(y) = g'*y + (1/2)*y'*H*y + (1/3)*sigma*norm(y)^3.
0013 %
0014 % This is achieved by reducing the problem to a univariate root finding
0015 % problem, where the unknown is a scalar lambda. This root is computed
0016 % using a Newton method.
0017 %
0018 % Other outputs are iter (the number of Newton iterations completed),
0019 % lambda (a real scalar, see below) and status. The latter is 0 if the
0020 % target tolerance was reached, 1 if subsequent iterations induce no
0021 % significant change, and -1 if the algorithm return because it reached
0022 % the maximum number of iterations (see the options structure.)
0023 % Non-negative status values are considered successes.
0024 %
0025 % The options structure must contain the following fields (between
0026 % parentheses are some recommended values):
0027 %   options.verbosity (3): to control how much information this function
0028 %   prints to the command window. Anything below 6 silences the function.
0029 %   options.maxiter_newton (100): maximum number of Newton iterations.
0030 %   options.tol_newton (1e-16): tolerance on the root finding accuracy. See
0031 %   in code for details.
0032 %
0033 % The code is based on Section 6 in
0034 % Cartis, Gould and Toint, "Adaptive cubic regularisation methods for
0035 % unconstrained optimization. Part I: motivation, convergence and numerical
0036 % results", Mathematical Programming, 2011.
0037 % https://link.springer.com/article/10.1007/s10107-009-0286-5
0038 %
0039 % Theorem 3.1 in the referenced paper states y is optimal if and only
0040 % if it there exists a real lambda such that
0041 %
0042 % (H + lambda*I)y = -g,  lambda = sigma*||y||  and  H + lambda*I is psd,
0043 %
0044 % where psd means positive semidefinite. The other way around, if we
0045 % find the corresponding scalar lambda, than we can recover y by
0046 % solving a linear system (though this system might not have a unique
0047 % solution in full generality.) Thus, the general strategy is to search
0048 % for lambda rather than for y.
0049 %
0050 % See also: arc arc_lanczos
0051 
0052 % This file is part of Manopt: www.manopt.org.
0053 % Original authors: May 1, 2018,
0054 %    Naman Agarwal, Brian Bullins, Nicolas Boumal and Coralia Cartis.
0055 % Contributors:
0056 % Change log:
0057 
0058     n = size(H, 1);
0059     
0060     % Pick an initial lambda that is cheap to compute and that surely makes
0061     % the shifted H positive definite.
0062     lambda = norm(H, 1) + 2;
0063     H_shifted = H + lambda*speye(n);
0064     
0065     % Compute the smallest eigenvalue of H, as we know the target lambda
0066     % must be at least as large as the negative of that, so that the
0067     % shifted H will be positive semidefinite.
0068     %
0069     % Since H ought to be sparse and tridiagonal, and since we only need
0070     % its smallest eigenvalue, this computation could be sped up
0071     % significantly. It does not appear to be a bottleneck, and eig is
0072     % simple and reliable, so we keep this for now.
0073     lambda_min = min(eig(H));
0074     left_barrier = max(0, -lambda_min);
0075     
0076     % Counter 'iter' holds the number of fully executed Newton iterations.
0077     iter = 0;
0078     while true
0079         
0080         if iter >= options.maxiter_newton
0081             % Iterations exceeded maximum number allowed.
0082             status = -1;
0083             return;
0084         end
0085         
0086         % If lambda has the correct value and the shifted H is positive
0087         % definite, then this y is a minimizer.
0088         y = -(H_shifted\g);
0089         ynorm = norm(y);
0090 
0091         % If the following quantity is zero, we have found a solution.
0092         phi = 1/ynorm - sigma/lambda;
0093         
0094         % Check if it is close enough to zero to stop.
0095         if abs(phi) <= options.tol_newton*ynorm
0096             status = 0;
0097             return;
0098         end
0099         psi = ynorm^2;
0100 
0101         % TODO: clarify this part of the code (see referenced paper).
0102         % The following is a Newton type of step on the equation
0103         % sigma/lambda = 1/sqrt(psi(lambda_prev)) ...
0104         %          - (lambda - lambda_prev)((psi'(lambda_prev))/2(psi)^1.5)
0105         delta_y = -(H_shifted\y);
0106         psi_prime = 2*(y'*delta_y);
0107         p0 = 2*sigma*(psi^(1.5));
0108         p1 = -2*psi - lambda*psi_prime;
0109         p2 = psi_prime;
0110         r = roots([p2 p1 p0]);
0111         del_lambda = max(r) - lambda;
0112         iter = iter + 1;
0113 
0114         % If the Newton step would bring us left of the left barrier, jump
0115         % instead to the midpoint between the left barrier and the current
0116         % lambda.
0117         if lambda + del_lambda <= left_barrier
0118             del_lambda = -.5*(lambda - left_barrier);
0119         end
0120 
0121         % If the step is so small that it numerically does not make a
0122         % difference when added to the current lambda, we stop.
0123         if abs(del_lambda) <= eps(lambda)
0124             status = 1;
0125             return;
0126         end
0127 
0128         % Update lambda
0129         H_shifted = H_shifted + del_lambda*speye(n);
0130         lambda = lambda + del_lambda;
0131         
0132         
0133         if options.verbosity >= 6
0134             fprintf(['lambda %.12e, ||y|| %.12e, lambda/sigma %.12e, ' ...
0135                      'phi %.12e\n\n'], lambda, ynorm, lambda / sigma, phi);
0136         end
0137 
0138     end
0139 
0140 end

minimize_cubic_newton

PURPOSE

SYNOPSIS

DESCRIPTION

CROSS-REFERENCE INFORMATION

SOURCE CODE