Home > manopt > solvers > arc > arc_gradient_descent.m

## PURPOSE

Subproblem solver for ARC based on gradient descent.

## DESCRIPTION

``` Subproblem solver for ARC based on gradient descent.

[eta, Heta, hesscalls, stop_str, stats] =

This routine approximately solves the following problem:

min_{eta in T_x M}  m(eta),  where

m(eta) = <eta, g> + .5 <eta, H[eta]> + (sigma/3) ||eta||^3

where eta is a tangent vector at x on the manifold given by problem.M,
g = grad is a tangent vector at x, H[eta] is the result of applying the
Hessian of the problem at x along eta and the inner product and norm
are those from the Riemannian structure on the tangent space T_x M.

The solve is approximate in the sense that the returned eta only ought
to satisfy the following conditions:

||gradient of m at eta|| <= theta*||eta||^2   and   m(eta) <= m(0),

where theta is specified in options.theta (see below for default value.)
Since the gradient of the model at 0 is g, if it is zero, then eta = 0
is returned. This is the only scenario where eta = 0 is returned.

Numerical errors can perturb the described expected behavior.

Inputs:
problem: Manopt optimization problem structure
x: point on the manifold problem.M
sigma: cubic regularization parameter (positive scalar)
options: structure containing options for the subproblem solver
storedb, key: caching data for problem at x

Options specific to this subproblem solver:
theta (0.5)
Stopping criterion parameter for subproblem solver: the gradient of
the model at the returned step should have norm no more than theta
times the squared norm of the step.
maxinner (100)
Maximum number of iterations of the gradient descent algorithm.

Outputs:
eta: approximate solution to the cubic regularized subproblem at x
Heta: Hess f(x)[eta] -- this is necessary in the outer loop, and it
is often naturally available to the subproblem solver at the
end of execution, so that it may be cheaper to return it here.
hesscalls: number of Hessian calls during execution
stop_str: string describing why the subsolver stopped
stats: a structure specifying some statistics about inner work -
we record the model cost value and model gradient norm at each
inner iteration.```

## CROSS-REFERENCE INFORMATION

This function calls:
• getHessian Computes the Hessian of the cost function at x along d.
• mergeOptions Merges two options structures with one having precedence over the other.
• solve_along_line Minimize the function h(eta) = f(x + eta*y) where
This function is called by:

## SOURCE CODE

```0001 function [eta, Heta, hesscalls, stop_str, stats] = arc_gradient_descent(problem, x, grad, gradnorm, sigma, options, storedb, key)
0002 % Subproblem solver for ARC based on gradient descent.
0003 %
0004 % [eta, Heta, hesscalls, stop_str, stats] =
0006 %
0007 % This routine approximately solves the following problem:
0008 %
0009 %   min_{eta in T_x M}  m(eta),  where
0010 %
0011 %       m(eta) = <eta, g> + .5 <eta, H[eta]> + (sigma/3) ||eta||^3
0012 %
0013 % where eta is a tangent vector at x on the manifold given by problem.M,
0014 % g = grad is a tangent vector at x, H[eta] is the result of applying the
0015 % Hessian of the problem at x along eta and the inner product and norm
0016 % are those from the Riemannian structure on the tangent space T_x M.
0017 %
0018 % The solve is approximate in the sense that the returned eta only ought
0019 % to satisfy the following conditions:
0020 %
0021 %   ||gradient of m at eta|| <= theta*||eta||^2   and   m(eta) <= m(0),
0022 %
0023 % where theta is specified in options.theta (see below for default value.)
0024 % Since the gradient of the model at 0 is g, if it is zero, then eta = 0
0025 % is returned. This is the only scenario where eta = 0 is returned.
0026 %
0027 % Numerical errors can perturb the described expected behavior.
0028 %
0029 % Inputs:
0030 %     problem: Manopt optimization problem structure
0031 %     x: point on the manifold problem.M
0032 %     grad: gradient of the cost function of the problem at x
0033 %     gradnorm: norm of the gradient, often available to the caller
0034 %     sigma: cubic regularization parameter (positive scalar)
0035 %     options: structure containing options for the subproblem solver
0036 %     storedb, key: caching data for problem at x
0037 %
0038 % Options specific to this subproblem solver:
0039 %   theta (0.5)
0040 %     Stopping criterion parameter for subproblem solver: the gradient of
0041 %     the model at the returned step should have norm no more than theta
0042 %     times the squared norm of the step.
0043 %   maxinner (100)
0044 %     Maximum number of iterations of the gradient descent algorithm.
0045 %
0046 % Outputs:
0047 %     eta: approximate solution to the cubic regularized subproblem at x
0048 %     Heta: Hess f(x)[eta] -- this is necessary in the outer loop, and it
0049 %           is often naturally available to the subproblem solver at the
0050 %           end of execution, so that it may be cheaper to return it here.
0051 %     hesscalls: number of Hessian calls during execution
0052 %     stop_str: string describing why the subsolver stopped
0053 %     stats: a structure specifying some statistics about inner work -
0054 %            we record the model cost value and model gradient norm at each
0055 %            inner iteration.
0056
0057 % This file is part of Manopt: www.manopt.org.
0058 % Original authors: May 2, 2019,
0059 %    Bryan Zhu, Nicolas Boumal.
0060 % Contributors:
0061 % Change log:
0062 %
0063 %   Aug. 19, 2019 (NB):
0064 %       Option maxiter_gradient renamed to maxinner to match trustregions.
0065
0066     % Some shortcuts
0067     M = problem.M;
0068     inner = @(u, v) M.inner(x, u, v);
0069     rnorm = @(u) M.norm(x, u);
0070     tangent = @(u) problem.M.tangent(x, u);
0071     Hess = @(u) getHessian(problem, x, u, storedb, key);
0072
0073     % Counter for Hessian calls issued
0074     hesscalls = 0;
0075
0076     % If the gradient has norm zero, return a zero step
0078         eta = M.zerovec(x);
0079         Heta = eta;
0080         stop_str = 'Cost gradient is zero';
0081         stats = struct('gradnorms', 0, 'func_values', 0);
0082         return;
0083     end
0084
0085     % Set local defaults here
0086     localdefaults.theta = 0.5;
0087     localdefaults.maxinner = 100;
0088
0089     % Merge local defaults with user options, if any
0090     if ~exist('options', 'var') || isempty(options)
0091         options = struct();
0092     end
0093     options = mergeOptions(localdefaults, options);
0094
0095     % Calculate the Cauchy point as our initial step
0097     hesscalls = hesscalls + 1;
0099     R_c = -temp + sqrt(temp * temp + gradnorm / sigma);
0102
0103     % Main gradient descent iteration
0105     func_values = zeros(options.maxinner, 1);
0107     j = 1;
0108     while j < options.maxinner
0109         % Calculate the gradient of the model
0110         eta_norm = rnorm(eta);
0114
0115         % Compute some statistics
0117         func_values(j) = inner(grad, eta) + 0.5 * inner(eta, Heta) + (sigma/3) * eta_norm^3;
0118
0119         % Check termination condition
0120         if rnorm(neg_mgrad) <= options.theta * eta_norm^2
0121             stop_str = 'Model grad norm condition satisfied';
0123             break;
0124         end
0125
0126         % Find the optimal step in the negative direction of the gradient
0128         hesscalls = hesscalls + 1;
0130         if step == 0
0131             stop_str = 'Failed optimal increase';
0133             break;
0134         end
0135         eta = M.lincomb(x, 1, eta, step, neg_mgrad);
0136         Heta = M.lincomb(x, 1, Heta, step, Hnmgrad);
0137         j = j + 1;
0138     end
0139
0140     % Check why we stopped iterating
0142         stop_str = sprintf(['Reached max number of gradient descent iterations ' ...
0143                '(options.maxinner = %d)'], options.maxinner);
0144         j = j - 1;
0145     end
0146
0147     % Return the point we ended on
0148     eta = tangent(eta);