## PURPOSE

Conjugate gradient minimization algorithm for Manopt.

## SYNOPSIS

function [x, cost, info, options] = conjugategradient(problem, x, options)

## DESCRIPTION

``` Conjugate gradient minimization algorithm for Manopt.

function [x, cost, info, options] = conjugategradient(problem)
function [x, cost, info, options] = conjugategradient(problem, x0)
function [x, cost, info, options] = conjugategradient(problem, x0, options)
function [x, cost, info, options] = conjugategradient(problem, [], options)

Apply the conjugate gradient minimization algorithm to the problem
defined in the problem structure, starting at x0 if it is provided
(otherwise, at a random point on the manifold). To specify options whilst
not specifying an initial guess, give x0 as [] (the empty matrix).

The outputs x and cost are the best reached point on the manifold and its
cost. The struct-array info contains information about the iterations:
iter : the iteration number (0 for the initial guess)
cost : cost value
time : elapsed time in seconds
stepsize : norm of the last tangent vector retracted
beta : value of the beta parameter (see options.beta_type)
linesearch : information logged by options.linesearch
And possibly additional information logged by options.statsfun.
For example, type [info.gradnorm] to obtain a vector of the successive

The options structure is used to overwrite the default values. All
options have a default value and are hence optional. To force an option
value, pass an options structure with a field options.optionname, where
optionname is one of the following and the default value is indicated
between parentheses:

The algorithm terminates if the norm of the gradient drops below this.
maxiter (1000)
The algorithm terminates if maxiter iterations have been executed.
maxtime (Inf)
The algorithm terminates if maxtime seconds elapsed.
minstepsize (1e-10)
The algorithm terminates if the linesearch returns a displacement
vector (to be retracted) smaller in norm than this value.
beta_type ('H-S')
Conjugate gradient beta rule used to construct the new search
direction, based on a linear combination of the previous search
direction and the new (preconditioned) gradient. Possible values
for this parameter are:
'S-D', 'steep' for beta = 0 (preconditioned steepest descent)
'F-R' for Fletcher-Reeves's rule
'P-R' for Polak-Ribiere's modified rule
'H-S' for Hestenes-Stiefel's modified rule
'H-Z' for Hager-Zhang's modified rule
'L-S' for Sato's Liu-Storey rule
See Hager and Zhang 2006, "A survey of nonlinear conjugate gradient
methods" for a description of these rules in the Euclidean case and
for an explanation of how to adapt them to the preconditioned case.
The adaption to the Riemannian case is straightforward: see in code
for details. Modified rules take the max between 0 and the computed
beta value, which provides automatic restart, except for H-Z and L-S
which use a different modification. Sato's Liu-Storey rule is
described in Sato 2021, "Riemannian conjugate gradient methods:
General framework and specific algorithms with convergence analyses"
orth_value (Inf)
Following Powell's restart strategy (Math. prog. 1977), restart CG
(that is, make a -preconditioned- gradient step) if two successive
-preconditioned- gradients are "too" parallel. See for example
Hager and Zhang 2006, "A survey of nonlinear conjugate gradient
methods", page 12. An infinite value disables this strategy. See in
code formula for the specific criterion used.
Function handle to a line search function. The options structure is
passed to the line search too, so you can pass it parameters. See
each line search's documentation for info. Another available line
search in manopt is @linesearch, in /manopt/linesearch/linesearch.m
If the problem structure includes a line search hint, then the
default line search used is @linesearch_hint.
statsfun (none)
Function handle to a function that will be called after each
iteration to provide the opportunity to log additional statistics.
They will be returned in the info struct. See the generic Manopt
documentation about solvers for further information.
stopfun (none)
Function handle to a function that will be called at each iteration
to provide the opportunity to specify additional stopping criteria.
See the generic Manopt documentation about solvers for further
information.
verbosity (3)
Integer number used to tune the amount of output the algorithm
generates during execution (mostly as text in the command window).
The higher, the more output. 0 means silent.
storedepth (2)
Maximum number of different points x of the manifold for which a
store structure will be kept in memory in the storedb. If the
caching features of Manopt are not used, this is irrelevant. For
the CG algorithm, a store depth of 2 should always be sufficient.

In most of the examples bundled with the toolbox (see link below), the
solver can be replaced by the present one if need be.

## CROSS-REFERENCE INFORMATION

This function calls:
• StoreDB
• applyStatsfun Apply the statsfun function to a stats structure (for solvers).
• canGetApproxGradient Checks whether an approximate gradient can be computed for this problem.
• canGetCost Checks whether the cost function can be computed for a problem structure.
• canGetGradient Checks whether the gradient can be computed for a problem structure.
• canGetLinesearch Checks whether the problem structure can give a line-search a hint.
• getCostGrad Computes the cost function and the gradient at x in one call if possible.
• getGlobalDefaults Returns a structure with default option values for Manopt.
• getPrecon Applies the preconditioner for the Hessian of the cost at x along d.
• mergeOptions Merges two options structures with one having precedence over the other.
• stoppingcriterion Checks for standard stopping criteria, as a helper to solvers.
• approxgradientFD Gradient approx. fnctn handle based on finite differences of the cost.
• linesearch_adaptive Adaptive line search algorithm (step size selection) for descent methods.
• linesearch_hint Armijo line-search based on the line-search hint in the problem structure.
This function is called by:

## SOURCE CODE

```0001 function [x, cost, info, options] = conjugategradient(problem, x, options)
0002 % Conjugate gradient minimization algorithm for Manopt.
0003 %
0004 % function [x, cost, info, options] = conjugategradient(problem)
0005 % function [x, cost, info, options] = conjugategradient(problem, x0)
0006 % function [x, cost, info, options] = conjugategradient(problem, x0, options)
0007 % function [x, cost, info, options] = conjugategradient(problem, [], options)
0008 %
0009 % Apply the conjugate gradient minimization algorithm to the problem
0010 % defined in the problem structure, starting at x0 if it is provided
0011 % (otherwise, at a random point on the manifold). To specify options whilst
0012 % not specifying an initial guess, give x0 as [] (the empty matrix).
0013 %
0014 % The outputs x and cost are the best reached point on the manifold and its
0015 % cost. The struct-array info contains information about the iterations:
0016 %   iter : the iteration number (0 for the initial guess)
0017 %   cost : cost value
0018 %   time : elapsed time in seconds
0020 %   stepsize : norm of the last tangent vector retracted
0021 %   beta : value of the beta parameter (see options.beta_type)
0022 %   linesearch : information logged by options.linesearch
0023 %   And possibly additional information logged by options.statsfun.
0024 % For example, type [info.gradnorm] to obtain a vector of the successive
0026 %
0027 % The options structure is used to overwrite the default values. All
0028 % options have a default value and are hence optional. To force an option
0029 % value, pass an options structure with a field options.optionname, where
0030 % optionname is one of the following and the default value is indicated
0031 % between parentheses:
0032 %
0034 %       The algorithm terminates if the norm of the gradient drops below this.
0035 %   maxiter (1000)
0036 %       The algorithm terminates if maxiter iterations have been executed.
0037 %   maxtime (Inf)
0038 %       The algorithm terminates if maxtime seconds elapsed.
0039 %   minstepsize (1e-10)
0040 %       The algorithm terminates if the linesearch returns a displacement
0041 %       vector (to be retracted) smaller in norm than this value.
0042 %   beta_type ('H-S')
0043 %       Conjugate gradient beta rule used to construct the new search
0044 %       direction, based on a linear combination of the previous search
0045 %       direction and the new (preconditioned) gradient. Possible values
0046 %       for this parameter are:
0047 %           'S-D', 'steep' for beta = 0 (preconditioned steepest descent)
0048 %           'F-R' for Fletcher-Reeves's rule
0049 %           'P-R' for Polak-Ribiere's modified rule
0050 %           'H-S' for Hestenes-Stiefel's modified rule
0051 %           'H-Z' for Hager-Zhang's modified rule
0052 %           'L-S' for Sato's Liu-Storey rule
0053 %       See Hager and Zhang 2006, "A survey of nonlinear conjugate gradient
0054 %       methods" for a description of these rules in the Euclidean case and
0055 %       for an explanation of how to adapt them to the preconditioned case.
0056 %       The adaption to the Riemannian case is straightforward: see in code
0057 %       for details. Modified rules take the max between 0 and the computed
0058 %       beta value, which provides automatic restart, except for H-Z and L-S
0059 %       which use a different modification. Sato's Liu-Storey rule is
0060 %       described in Sato 2021, "Riemannian conjugate gradient methods:
0061 %       General framework and specific algorithms with convergence analyses"
0062 %   orth_value (Inf)
0063 %       Following Powell's restart strategy (Math. prog. 1977), restart CG
0064 %       (that is, make a -preconditioned- gradient step) if two successive
0065 %       -preconditioned- gradients are "too" parallel. See for example
0066 %       Hager and Zhang 2006, "A survey of nonlinear conjugate gradient
0067 %       methods", page 12. An infinite value disables this strategy. See in
0068 %       code formula for the specific criterion used.
0069 %   linesearch (@linesearch_adaptive or @linesearch_hint)
0070 %       Function handle to a line search function. The options structure is
0071 %       passed to the line search too, so you can pass it parameters. See
0072 %       each line search's documentation for info. Another available line
0073 %       search in manopt is @linesearch, in /manopt/linesearch/linesearch.m
0074 %       If the problem structure includes a line search hint, then the
0075 %       default line search used is @linesearch_hint.
0076 %   statsfun (none)
0077 %       Function handle to a function that will be called after each
0078 %       iteration to provide the opportunity to log additional statistics.
0079 %       They will be returned in the info struct. See the generic Manopt
0080 %       documentation about solvers for further information.
0081 %   stopfun (none)
0082 %       Function handle to a function that will be called at each iteration
0083 %       to provide the opportunity to specify additional stopping criteria.
0084 %       See the generic Manopt documentation about solvers for further
0085 %       information.
0086 %   verbosity (3)
0087 %       Integer number used to tune the amount of output the algorithm
0088 %       generates during execution (mostly as text in the command window).
0089 %       The higher, the more output. 0 means silent.
0090 %   storedepth (2)
0091 %       Maximum number of different points x of the manifold for which a
0092 %       store structure will be kept in memory in the storedb. If the
0093 %       caching features of Manopt are not used, this is irrelevant. For
0094 %       the CG algorithm, a store depth of 2 should always be sufficient.
0095 %
0096 %
0097 % In most of the examples bundled with the toolbox (see link below), the
0098 % solver can be replaced by the present one if need be.
0099 %
0101
0102 % An explicit, general listing of this algorithm, with preconditioning,
0103 % can be found in the following paper:
0104 %     @Article{boumal2015lowrank,
0105 %       Title   = {Low-rank matrix completion via preconditioned optimization on the {G}rassmann manifold},
0106 %       Author  = {Boumal, N. and Absil, P.-A.},
0107 %       Journal = {Linear Algebra and its Applications},
0108 %       Year    = {2015},
0109 %       Pages   = {200--239},
0110 %       Volume  = {475},
0111 %       Doi     = {10.1016/j.laa.2015.02.027},
0112 %     }
0113
0114 % This file is part of Manopt: www.manopt.org.
0115 % Original author: Bamdev Mishra, Dec. 30, 2012.
0116 % Contributors: Nicolas Boumal, Nick Vannieuwenhoven
0117 % Change log:
0118 %
0119 %   March 14, 2013, NB:
0120 %       Added preconditioner support : see Section 8 in
0121 %       https://www.math.lsu.edu/~hozhang/papers/cgsurvey.pdf
0122 %
0123 %   Sept. 13, 2013, NB:
0124 %       Now logging beta parameter too.
0125 %
0126 %   Nov. 7, 2013, NB:
0127 %       The search direction is no longer normalized before it is passed
0128 %       to the linesearch. This way, it is up to the designers of the
0129 %       linesearch to decide whether they want to use the norm of the
0130 %       search direction in their algorithm or not. There are reasons
0131 %       against it, but practical evidence that it may help too, so we
0132 %       allow it. The default linesearch_adaptive used does exploit the
0133 %       norm information. The base linesearch does not. You may select it
0134 %       by setting options.linesearch = @linesearch;
0135 %
0136 %   Nov. 29, 2013, NB:
0137 %       Documentation improved: options are now explicitly described.
0138 %       Removed the Daniel rule for beta: it was not appropriate for
0139 %       preconditioned CG and I could not find a proper reference for it.
0140 %
0141 %   April 3, 2015 (NB):
0142 %       Works with the new StoreDB class system.
0143 %
0144 %   Aug. 2, 2018 (NB):
0145 %       Now using storedb.remove() to keep the cache lean.
0146 %
0147 %   Feb. 7, 2022 (NV):
0148 %       Added support for Liu-Storey rule (L-S).
0149
0150 M = problem.M;
0151
0152 % Verify that the problem description is sufficient for the solver.
0153 if ~canGetCost(problem)
0154     warning('manopt:getCost', ...
0155         'No cost provided. The algorithm will likely abort.');
0156 end
0160             'It may be necessary to increase options.tolgradnorm.\n' ...
0161             'To disable this warning: warning(''off'', ''manopt:getGradient:approx'')']);
0163 end
0164
0165 % Set local defaults here
0166 localdefaults.minstepsize = 1e-10;
0167 localdefaults.maxiter = 1000;
0169 localdefaults.storedepth = 20;
0170 % Changed by NB : H-S has the "auto restart" property.
0171 % See Hager-Zhang 2005/2006 survey about CG methods.
0172 % The auto restart comes from the 'max(0, ...)', not so much from the
0173 % reason stated in Hager-Zhang I think. P-R also has auto restart.
0174 localdefaults.beta_type = 'H-S';
0175 localdefaults.orth_value = Inf; % by BM as suggested in Nocedal and Wright
0176
0177
0178 % Depending on whether the problem structure specifies a hint for
0179 % line-search algorithms, choose a default line-search that works on
0180 % its own (typical) or that uses the hint.
0181 if ~canGetLinesearch(problem)
0183 else
0184     localdefaults.linesearch = @linesearch_hint;
0185 end
0186
0187 % Merge global and local defaults, then merge w/ user options, if any.
0188 localdefaults = mergeOptions(getGlobalDefaults(), localdefaults);
0189 if ~exist('options', 'var') || isempty(options)
0190     options = struct();
0191 end
0192 options = mergeOptions(localdefaults, options);
0193
0194 timetic = tic();
0195
0196 % If no initial point x is given by the user, generate one at random.
0197 if ~exist('x', 'var') || isempty(x)
0198     x = M.rand();
0199 end
0200
0201 % Create a store database and generate a key for the current x
0202 storedb = StoreDB(options.storedepth);
0203 key = storedb.getNewKey();
0204
0205 % Compute cost-related quantities for x
0210
0211 % Iteration counter (at any point, iter is the number of fully executed
0212 % iterations so far)
0213 iter = 0;
0214
0215 % Save stats in a struct array info and preallocate.
0216 stats = savestats();
0217 info(1) = stats;
0218 info(min(10000, options.maxiter+1)).iter = [];
0219
0220
0221 if options.verbosity >= 2
0222     fprintf(' iter\t               cost val\t    grad. norm\n');
0223 end
0224
0225 % Compute a first descent direction (not normalized)
0226 desc_dir = M.lincomb(x, -1, Pgrad);
0227
0228
0229 % Start iterating until stopping criterion triggers
0230 while true
0231
0232     % Display iteration information
0233     if options.verbosity >= 2
0235     end
0236
0237     % Start timing this iteration
0238     timetic = tic();
0239
0240     % Run standard stopping criterion checks
0241     [stop, reason] = stoppingcriterion(problem, x, options, info, iter+1);
0242
0243     % Run specific stopping criterion check
0244     if ~stop && abs(stats.stepsize) < options.minstepsize
0245         stop = true;
0246         reason = sprintf(['Last stepsize smaller than minimum '  ...
0247                           'allowed; options.minstepsize = %g.'], ...
0248                           options.minstepsize);
0249     end
0250
0251     if stop
0252         if options.verbosity >= 1
0253             fprintf([reason '\n']);
0254         end
0255         break;
0256     end
0257
0258
0259     % The line search algorithms require the directional derivative of the
0260     % cost at the current point x along the search direction.
0261     df0 = M.inner(x, grad, desc_dir);
0262
0263     % If we didn't get a descent direction: restart, i.e., switch to the
0264     % negative gradient. Equivalent to resetting the CG direction to a
0265     % steepest descent step, which discards the past information.
0266     if df0 >= 0
0267
0268         % Or we switch to the negative gradient direction.
0269         if options.verbosity >= 3
0270             fprintf(['Conjugate gradient info: got an ascent direction '...
0271                      '(df0 = %2e), reset to the (preconditioned) '...
0272                      'steepest descent direction.\n'], df0);
0273         end
0275         desc_dir = M.lincomb(x, -1, Pgrad);
0277
0278     end
0279
0280
0281     % Execute line search
0282     [stepsize, newx, newkey, lsstats] = options.linesearch( ...
0283                    problem, x, desc_dir, cost, df0, options, storedb, key);
0284
0285
0286     % Compute the new cost-related quantities for newx
0291
0292
0293     % Apply the CG scheme to compute the next search direction.
0294     %
0295     % This paper https://www.math.lsu.edu/~hozhang/papers/cgsurvey.pdf
0296     % by Hager and Zhang lists many known beta rules. The rules defined
0297     % here can be found in that paper (or are provided with additional
0298     % references), adapted to the Riemannian setting.
0299     %
0300     if strcmpi(options.beta_type, 'steep') || ...
0301        strcmpi(options.beta_type, 'S-D')              % Gradient Descent
0302
0303         beta = 0;
0304         desc_dir = M.lincomb(newx, -1, Pnewgrad);
0305
0306     else
0307
0310
0311         % Powell's restart strategy (see page 12 of Hager and Zhang's
0312         % survey on conjugate gradient methods, for example)
0314             beta = 0;
0315             desc_dir = M.lincomb(x, -1, Pnewgrad);
0316
0317         else % Compute the CG modification
0318
0319             desc_dir = M.transp(x, newx, desc_dir);
0320
0321             switch upper(options.beta_type)
0322
0323                 case 'F-R'  % Fletcher-Reeves
0325
0326                 case 'P-R'  % Polak-Ribiere+
0329                     ip_diff = M.inner(newx, Pnewgrad, diff);
0331                     beta = max(0, beta);
0332
0333                 case 'H-S'  % Hestenes-Stiefel+
0335                     ip_diff = M.inner(newx, Pnewgrad, diff);
0336                     beta = ip_diff / M.inner(newx, diff, desc_dir);
0337                     beta = max(0, beta);
0338
0339                 case 'H-Z' % Hager-Zhang+
0343                     deno = M.inner(newx, diff, desc_dir);
0344                     numo = M.inner(newx, diff, Pnewgrad);
0345                     numo = numo - 2*M.inner(newx, diff, Pdiff)*...
0346                                      M.inner(newx, desc_dir, newgrad) / deno;
0347                     beta = numo / deno;
0348
0349                     % Robustness (see Hager-Zhang paper mentioned above)
0350                     desc_dir_norm = M.norm(newx, desc_dir);
0351                     eta_HZ = -1 / ( desc_dir_norm * min(0.01, gradnorm) );
0352                     beta = max(beta, eta_HZ);
0353
0354                 case 'L-S' % Liu-Storey+ from Sato
0356                     ip_diff = M.inner(newx, Pnewgrad, diff);
0357                     denom = -1*M.inner(x, grad, desc_dir);
0358                     betaLS = ip_diff / denom;
0360                     beta = max(0, min(betaLS, betaCD));
0361
0362                 otherwise
0363                     error(['Unknown options.beta_type. ' ...
0364                            'Should be steep, S-D, F-R, P-R, H-S, H-Z, or L-S.']);
0365             end
0366
0367             desc_dir = M.lincomb(newx, -1, Pnewgrad, beta, desc_dir);
0368
0369         end
0370
0371     end
0372
0373     % Transfer iterate info.
0374     storedb.removefirstifdifferent(key, newkey);
0375     x = newx;
0376     key = newkey;
0377     cost = newcost;
0382
0383     % iter is the number of iterations we have accomplished.
0384     iter = iter + 1;
0385
0386     % Make sure we don't use too much memory for the store database.
0387     storedb.purge();
0388
0389     % Log statistics for freshly executed iteration.
0390     stats = savestats();
0391     info(iter+1) = stats;
0392
0393 end
0394
0395
0396 info = info(1:iter+1);
0397
0398 if options.verbosity >= 1
0399     fprintf('Total time is %f [s] (excludes statsfun)\n', info(end).time);
0400 end
0401
0402
0403 % Routine in charge of collecting the current iteration stats
0404 function stats = savestats()
0405     stats.iter = iter;
0406     stats.cost = cost;
0408     if iter == 0
0409         stats.stepsize = nan;
0410         stats.time = toc(timetic);
0411         stats.linesearch = [];
0412         stats.beta = 0;
0413     else
0414         stats.stepsize = stepsize;
0415         stats.time = info(iter).time + toc(timetic);
0416         stats.linesearch = lsstats;
0417         stats.beta = beta;
0418     end
0419     stats = applyStatsfun(problem, x, storedb, key, options, stats);
0420 end
0421
0422 end
0423
0424```

Generated on Fri 30-Sep-2022 13:18:25 by m2html © 2005