
# arc

## PURPOSE

Adaptive regularization by cubics (ARC) minimization algorithm for Manopt

## SYNOPSIS

function [x, cost, info, options] = arc(problem, x, options)

## DESCRIPTION

```
Adaptive regularization by cubics (ARC) minimization algorithm for Manopt

function [x, cost, info, options] = arc(problem)
function [x, cost, info, options] = arc(problem, x0)
function [x, cost, info, options] = arc(problem, x0, options)
function [x, cost, info, options] = arc(problem, [], options)

Apply the ARC minimization algorithm to the problem defined in the
problem structure, starting at x0 if it is provided (otherwise, at a
random point on the manifold). To specify options whilst not specifying
an initial guess, give x0 as [] (the empty matrix).

In most of the examples bundled with the toolbox (see link below), the
solver can be replaced by the present one as is.

With the default subproblem solver (@arc_conjugate_gradient), properly
tuning the parameter options.theta appears to be important for
performance. Users may want to try different values in the range 1e-3
to 1e3 for a particular application.

The outputs x and cost are the last reached point on the manifold and its
cost. The struct-array info contains information about the iterations:
  iter (integer)
      The (outer) iteration number, i.e., number of steps considered
      so far (whether accepted or rejected). The initial guess is 0.
  cost (double)
      The corresponding cost value.
  gradnorm (double)
      The (Riemannian) norm of the gradient.
  hessiancalls (integer)
      The number of Hessian calls issued by the subproblem solver to
      compute this iterate.
  time (double)
      The total elapsed time in seconds to reach the corresponding cost.
  rho (double)
      The regularized performance ratio for the iterate.
      See options.rho_regularization.
  rhonum, rhoden (double)
      Numerator and denominator of the performance ratio, before
      regularization.
  accepted (boolean)
      Whether the proposed iterate was accepted or not.
  stepsize (double)
      The (Riemannian) norm of the vector returned by the subproblem
      solver and which is retracted to obtain the proposed next iterate.
      If accepted = true for the corresponding iterate, this is the size
      of the step from the previous to the new iterate. If accepted is
      false, the step was not executed and this is the size of the
      rejected step.
  sigma (double)
      The cubic regularization parameter at the outer iteration.
  And possibly additional information logged by options.statsfun or by
  the subproblem solver.
For example, type [info.gradnorm] to obtain a vector of the successive
gradient norms reached and [info.time] to obtain a vector with the
corresponding computation times to reach that point.
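
For instance, after a completed run, convergence can be visualized with
    semilogy([info.iter], [info.gradnorm], '.-');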

The options structure is used to overwrite the default values. All
options have a default value and are hence optional. To force an option
value, pass an options structure with a field options.optionname, where
optionname is one of the following. The default value is indicated
between parentheses. The subproblem solver may also accept options.

  tolgradnorm (1e-6)
      The algorithm terminates if the norm of the gradient drops below this.
  maxiter (1000)
      The algorithm terminates if maxiter (outer) iterations have been executed.
  maxtime (Inf)
      The algorithm terminates if maxtime seconds elapsed.
  sigma_0 (100 / trust-regions default maximum radius)
      Initial regularization parameter.
  sigma_min (1e-7)
      Minimum regularization parameter.
  eta_1 (0.1)
      If rho is below this, the step is unsuccessful (rejected).
  eta_2 (0.9)
      If rho exceeds this, the step is very successful.
  gamma_1 (0.1)
      Shrinking factor for regularization parameter if very successful.
  gamma_2 (2)
      Expansion factor for regularization parameter if unsuccessful.
  subproblemsolver (@arc_conjugate_gradient)
      Function handle to a subproblem solver. The subproblem solver will
      also see this options structure, so that parameters can be passed
      to it through here as well. Built-in solvers included:
          arc_lanczos
          arc_conjugate_gradient
          arc_gradient_descent
  rho_regularization (1e3)
      See help for the same parameter in the trustregions solver.
  statsfun (none)
      Function handle to a function that will be called after each
      iteration to provide the opportunity to log additional statistics.
      They will be returned in the info struct. See the generic Manopt
      documentation about solvers for further information. An example
      appears after this list.
  stopfun (none)
      Function handle to a function that will be called at each iteration
      to provide the opportunity to specify additional stopping criteria.
      See the generic Manopt documentation about solvers for further
      information.
  verbosity (3)
      Integer number used to tune the amount of output the algorithm
      generates during execution (mostly as text in the command window).
      The higher, the more output. 0 means silent.
  storedepth (2)
      Maximum number of different points x of the manifold for which a
      store structure will be kept in memory in the storedb. If the
      caching features of Manopt are not used, this is irrelevant. As of
      version 5.0, this is not particularly important.
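
For example, additional statistics and a custom stopping criterion can
be specified as follows (an illustrative sketch; recording the current
point and the 10-second budget are arbitrary choices, not defaults):
    % Record the current point in the stats (may use substantial memory).
    options.statsfun = @(problem, x, stats) setfield(stats, 'current_x', x);
    % Additional stopping criterion: stop after 10 seconds of work.
    options.stopfun = @(problem, x, info, last) info(last).time > 10;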

Please cite the Manopt paper as well as the research paper:
@article{agarwal2018arcfirst,
  author  = {Agarwal, N. and Boumal, N. and Bullins, B. and Cartis, C.},
  title   = {Adaptive regularization with cubics on manifolds},
  journal = {arXiv preprint arXiv:1806.00065},
  year    = {2018}
}
```
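
As a minimal usage sketch (assuming Manopt is installed and on the path;
the random symmetric matrix A, the sphere manifold and the option values
are illustrative choices, not defaults), ARC can compute a dominant
eigenvector:

```
% Minimize f(x) = -x'*A*x over the unit sphere; minimizers are
% dominant eigenvectors of the symmetric matrix A.
n = 100;
A = randn(n); A = (A + A')/2;

problem.M = spherefactory(n);
problem.cost  = @(x) -x'*(A*x);
problem.egrad = @(x) -2*A*x;           % Euclidean gradient
problem.ehess = @(x, xdot) -2*A*xdot;  % Euclidean Hessian along xdot

% Options are optional; pass [] as x0 to start at a random point.
options.tolgradnorm = 1e-8;
options.theta = 0.5;  % subproblem tuning parameter: try 1e-3 to 1e3
[x, xcost, info] = arc(problem, [], options);

% Inspect the run: successive gradient norms per outer iteration.
semilogy([info.iter], [info.gradnorm], '.-');
xlabel('Outer iteration'); ylabel('Gradient norm');
```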

## CROSS-REFERENCE INFORMATION

This function calls:
• StoreDB
• applyStatsfun Apply the statsfun function to a stats structure (for solvers).
• canGetApproxGradient Checks whether an approximate gradient can be computed for this problem.
• canGetApproxHessian Checks whether an approximate Hessian can be computed for this problem.
• canGetCost Checks whether the cost function can be computed for a problem structure.
• canGetGradient Checks whether the gradient can be computed for a problem structure.
• canGetHessian Checks whether the Hessian can be computed for a problem structure.
• getCostGrad Computes the cost function and the gradient at x in one call if possible.
• getGlobalDefaults Returns a structure with default option values for Manopt.
• mergeOptions Merges two options structures with one having precedence over the other.
• stoppingcriterion Checks for standard stopping criteria, as a helper to solvers.
• arc_conjugate_gradient Subproblem solver for ARC based on a nonlinear conjugate gradient method.
• approxgradientFD Gradient approx. fnctn handle based on finite differences of the cost.
• approxhessianFD Hessian approx. fnctn handle based on finite differences of the gradient.
This function is called by:

## SOURCE CODE

```
function [x, cost, info, options] = arc(problem, x, options)
% Adaptive regularization by cubics (ARC) minimization algorithm for Manopt
%
% function [x, cost, info, options] = arc(problem)
% function [x, cost, info, options] = arc(problem, x0)
% function [x, cost, info, options] = arc(problem, x0, options)
% function [x, cost, info, options] = arc(problem, [], options)
%
% Apply the ARC minimization algorithm to the problem defined in the
% problem structure, starting at x0 if it is provided (otherwise, at a
% random point on the manifold). To specify options whilst not specifying
% an initial guess, give x0 as [] (the empty matrix).
%
% In most of the examples bundled with the toolbox (see link below), the
% solver can be replaced by the present one as is.
%
% With the default subproblem solver (@arc_conjugate_gradient), properly
% tuning the parameter options.theta appears to be important for
% performance. Users may want to try different values in the range 1e-3
% to 1e3 for a particular application.
%
% The outputs x and cost are the last reached point on the manifold and its
% cost. The struct-array info contains information about the iterations:
%   iter (integer)
%       The (outer) iteration number, i.e., number of steps considered
%       so far (whether accepted or rejected). The initial guess is 0.
%   cost (double)
%       The corresponding cost value.
%   gradnorm (double)
%       The (Riemannian) norm of the gradient.
%   hessiancalls (integer)
%       The number of Hessian calls issued by the subproblem solver to
%       compute this iterate.
%   time (double)
%       The total elapsed time in seconds to reach the corresponding cost.
%   rho (double)
%       The regularized performance ratio for the iterate.
%       See options.rho_regularization.
%   rhonum, rhoden (double)
%       Numerator and denominator of the performance ratio, before
%       regularization.
%   accepted (boolean)
%       Whether the proposed iterate was accepted or not.
%   stepsize (double)
%       The (Riemannian) norm of the vector returned by the subproblem
%       solver and which is retracted to obtain the proposed next iterate.
%       If accepted = true for the corresponding iterate, this is the size
%       of the step from the previous to the new iterate. If accepted is
%       false, the step was not executed and this is the size of the
%       rejected step.
%   sigma (double)
%       The cubic regularization parameter at the outer iteration.
%   And possibly additional information logged by options.statsfun or by
%   the subproblem solver.
% For example, type [info.gradnorm] to obtain a vector of the successive
% gradient norms reached and [info.time] to obtain a vector with the
% corresponding computation times to reach that point.
%
% The options structure is used to overwrite the default values. All
% options have a default value and are hence optional. To force an option
% value, pass an options structure with a field options.optionname, where
% optionname is one of the following. The default value is indicated
% between parentheses. The subproblem solver may also accept options.
%
%   tolgradnorm (1e-6)
%       The algorithm terminates if the norm of the gradient drops below this.
%   maxiter (1000)
%       The algorithm terminates if maxiter (outer) iterations have been executed.
%   maxtime (Inf)
%       The algorithm terminates if maxtime seconds elapsed.
%   sigma_0 (100 / trust-regions default maximum radius)
%       Initial regularization parameter.
%   sigma_min (1e-7)
%       Minimum regularization parameter.
%   eta_1 (0.1)
%       If rho is below this, the step is unsuccessful (rejected).
%   eta_2 (0.9)
%       If rho exceeds this, the step is very successful.
%   gamma_1 (0.1)
%       Shrinking factor for regularization parameter if very successful.
%   gamma_2 (2)
%       Expansion factor for regularization parameter if unsuccessful.
%   subproblemsolver (@arc_conjugate_gradient)
%       Function handle to a subproblem solver. The subproblem solver will
%       also see this options structure, so that parameters can be passed
%       to it through here as well. Built-in solvers included:
%           arc_lanczos
%           arc_conjugate_gradient
%           arc_gradient_descent
%   rho_regularization (1e3)
%       See help for the same parameter in the trustregions solver.
%   statsfun (none)
%       Function handle to a function that will be called after each
%       iteration to provide the opportunity to log additional statistics.
%       They will be returned in the info struct. See the generic Manopt
%       documentation about solvers for further information.
%   stopfun (none)
%       Function handle to a function that will be called at each iteration
%       to provide the opportunity to specify additional stopping criteria.
%       See the generic Manopt documentation about solvers for further
%       information.
%   verbosity (3)
%       Integer number used to tune the amount of output the algorithm
%       generates during execution (mostly as text in the command window).
%       The higher, the more output. 0 means silent.
%   storedepth (2)
%       Maximum number of different points x of the manifold for which a
%       store structure will be kept in memory in the storedb. If the
%       caching features of Manopt are not used, this is irrelevant. As of
%       version 5.0, this is not particularly important.
%
%
% Please cite the Manopt paper as well as the research paper:
% @article{agarwal2018arcfirst,
%   author  = {Agarwal, N. and Boumal, N. and Bullins, B. and Cartis, C.},
%   title   = {Adaptive regularization with cubics on manifolds},
%   journal = {arXiv preprint arXiv:1806.00065},
%   year    = {2018}
% }
%
%

% This file is part of Manopt: www.manopt.org.
% Original authors: May 1, 2018,
%    Naman Agarwal, Brian Bullins, Nicolas Boumal and Coralia Cartis.
% Contributors:
% Change log:
%
%   Aug 14, 2019 (NB):
%       Default subproblem solver for ARC is now arc_conjugate_gradient
%       instead of arc_lanczos. Default gamma_2 changed to 2 from 5.

    M = problem.M;

    % Verify that the problem description is sufficient for the solver.
    if ~canGetCost(problem)
        warning('manopt:getCost', ...
                'No cost provided. The algorithm will likely abort.');
    end
    if ~canGetGradient(problem) && ~canGetApproxGradient(problem)
        % Note: we do not give a warning if an approximate gradient is
        % explicitly given in the problem description, as in that case the
        % user seems to be aware of the issue.
        warning('manopt:getGradient:approx', ['No gradient provided. ' ...
                'Using an FD approximation instead (slow).\n' ...
                'It may be necessary to increase options.tolgradnorm.\n'...
                'To disable this warning: ' ...
                'warning(''off'', ''manopt:getGradient:approx'')']);
        problem.approxgrad = approxgradientFD(problem);
    end
    if ~canGetHessian(problem) && ~canGetApproxHessian(problem)
        % Note: we do not give a warning if an approximate Hessian is
        % explicitly given in the problem description, as in that case the
        % user seems to be aware of the issue.
        warning('manopt:getHessian:approx', ['No Hessian provided. ' ...
                'Using an FD approximation instead.\n' ...
                'To disable this warning: ' ...
                'warning(''off'', ''manopt:getHessian:approx'')']);
        problem.approxhess = approxhessianFD(problem);
    end

    % Set local defaults here
    localdefaults.tolgradnorm = 1e-6;
    localdefaults.maxiter = 1000;
    localdefaults.maxtime = inf;
    localdefaults.sigma_min = 1e-7;
    localdefaults.eta_1 = 0.1;
    localdefaults.eta_2 = 0.9;
    localdefaults.gamma_1 = 0.1;
    localdefaults.gamma_2 = 2;
    localdefaults.storedepth = 2;
    localdefaults.subproblemsolver = @arc_conjugate_gradient;
    localdefaults.rho_regularization = 1e3;

    % Merge global and local defaults, then merge w/ user options, if any.
    localdefaults = mergeOptions(getGlobalDefaults(), localdefaults);
    if ~exist('options', 'var') || isempty(options)
        options = struct();
    end
    options = mergeOptions(localdefaults, options);

    % Default initial sigma_0 is based on the initial Delta_bar of the
    % trustregions solver.
    if ~isfield(options, 'sigma_0')
        if isfield(M, 'typicaldist')
            options.sigma_0 = 100/M.typicaldist();
        else
            options.sigma_0 = 100/sqrt(M.dim());
        end
    end

    timetic = tic();

    % If no initial point x is given by the user, generate one at random.
    if ~exist('x', 'var') || isempty(x)
        x = M.rand();
    end

    % Create a store database and get a key for the current x.
    storedb = StoreDB(options.storedepth);
    key = storedb.getNewKey();

    % Compute objective-related quantities for x.
    [cost, grad] = getCostGrad(problem, x, storedb, key);
    gradnorm = M.norm(x, grad);

    % Initialize regularization parameter.
    sigma = options.sigma_0;

    % Iteration counter.
    % At any point, iter is the number of fully executed iterations so far.
    iter = 0;

    % Save stats in a struct array info, and preallocate.
    stats = savestats(problem, x, storedb, key, options);
    info(1) = stats;
    info(min(10000, options.maxiter+1)).iter = [];

    if options.verbosity >= 2
        fprintf(' iter\t\t\t\t\tcost val\t\t grad norm       sigma   #Hess\n');
    end

    % Iterate until stopping criterion triggers.
    while true

        % Display iteration information.
        if options.verbosity >= 2
            fprintf('%5d\t %+.16e\t %.8e   %.2e', ...
                    iter, cost, gradnorm, sigma);
        end

        % Start timing this iteration.
        timetic = tic();

        % Run standard stopping criterion checks.
        [stop, reason] = stoppingcriterion(problem, x, options, ...
                                                             info, iter+1);

        if stop
            if options.verbosity >= 1
                fprintf(['\n' reason '\n']);
            end
            break;
        end

        % Solve the ARC subproblem.
        % Outputs: eta is the tentative step (it is a tangent vector at x);
        % Heta is the result of applying the Hessian at x along eta (this
        % is often a natural by-product of the subproblem solver);
        % hesscalls is the number of Hessian calls issued in the solver;
        % stop_str is a string describing why the solver terminated; and
        % substats is some statistics about the solver's work to be logged.
        [eta, Heta, hesscalls, stop_str, substats] = ...
             options.subproblemsolver(problem, x, grad, gradnorm, ...
                                             sigma, options, storedb, key);

        etanorm = M.norm(x, eta);

        % Get a tentative next x by retracting the proposed step.
        newx = M.retr(x, eta);
        newkey = storedb.getNewKey();

        % Compute the new cost-related quantities for proposal x.
        % We could just compute the cost here, as the gradient is only
        % necessary if the step is accepted; but we expect most steps are
        % accepted, and sometimes the gradient can be computed faster if it
        % is computed in conjunction with the cost.
        [newcost, newgrad] = getCostGrad(problem, newx, storedb, newkey);

        % Compute a regularized ratio between actual and model improvement.
        rho_num = cost - newcost;
        vec_rho = M.lincomb(x, 1, grad, .5, Heta);
        rho_den = -M.inner(x, eta, vec_rho);
        rho_reg = options.rho_regularization*eps*max(1, abs(cost));
        rho = (rho_num+rho_reg) / (rho_den+rho_reg);
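        % Note: rho_num is the actual cost decrease, while rho_den is the
        % decrease predicted by the quadratic part of the model, that is,
        % minus the inner product of eta with grad + .5*Hess(eta). Near
        % convergence both are tiny differences of nearly equal numbers,
        % so the small additive term rho_reg (a modest multiple of eps)
        % guards the ratio against floating point cancellation; see
        % rho_regularization in the trustregions help for the same
        % heuristic.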

        % In principle, the subproblem solver should guarantee rho_den > 0.
        % In practice, it may fail, in which case we reject the step.
        subproblem_failure = (rho_den+rho_reg <= 0);
        if subproblem_failure
            stop_str = sprintf( ...
                 'SUBPROBLEM FAILURE! (Though it returned: %s)', stop_str);
        end

        % Determine if the tentative step should be accepted or not.
        if rho >= options.eta_1 && ~subproblem_failure
            accept = true;
            arc_str = 'acc ';
            % We accepted this step: erase cache of the previous point.
            storedb.removefirstifdifferent(key, newkey);
            x = newx;
            key = newkey;
            cost = newcost;
            grad = newgrad;
            gradnorm = M.norm(x, grad);
        else
            accept = false;
            arc_str = 'REJ ';
            % We rejected this step: erase cache of the tentative point.
            storedb.removefirstifdifferent(newkey, key);
        end

        % Update the regularization parameter.
        if rho >= options.eta_2 && ~subproblem_failure
            % Very successful step
            arc_str(4) = '-';
            if options.gamma_1 > 0
                sigma = max(options.sigma_min, options.gamma_1 * sigma);
            else
                sigma = max(options.sigma_min, min(sigma, gradnorm)); % TODO document this
            end
        elseif rho >= options.eta_1 && ~subproblem_failure
            % Successful step
            arc_str(4) = ' ';
        else
            % Unsuccessful step
            arc_str(4) = '+';
            sigma = options.gamma_2 * sigma;
        end

        % iter is the number of iterations we have completed.
        iter = iter + 1;

        % Make sure we don't use too much memory for the store database.
        storedb.purge();

        % Log statistics for freshly executed iteration.
        stats = savestats(problem, x, storedb, key, options);
        info(iter+1) = stats;

        if options.verbosity >= 2
            fprintf('   %5d  %s\n', hesscalls, [arc_str ' ' stop_str]);
        end

        % When the subproblem solver fails, it would be nice to have an
        % alternative, such as a slower but more robust solver. For now, we
        % force the solver to terminate when that happens.
        if subproblem_failure
            if options.verbosity >= 1
                fprintf(['\nThe subproblem solver failed to make ' ...
                         'progress even on the model; this is ' ...
                         'likely due to numerical errors.\n']);
            end
            break;
        end

    end

    % Truncate the struct-array to the part we populated
    info = info(1:iter+1);

    if options.verbosity >= 1
        fprintf('Total time is %f [s] (excludes statsfun)\n', info(end).time);
    end

    % Routine in charge of collecting the current iteration statistics
    function stats = savestats(problem, x, storedb, key, options)

        stats.iter = iter;
        stats.cost = cost;
        stats.gradnorm = gradnorm;
        stats.sigma = sigma;

        if iter == 0
            stats.hessiancalls = 0;
            stats.stepsize = NaN;
            stats.time = toc(timetic);
            stats.rho = inf;
            stats.rhonum = NaN;
            stats.rhoden = NaN;
            stats.accepted = true;
            stats.subproblem = struct();
        else
            stats.hessiancalls = hesscalls;
            stats.stepsize = etanorm;
            stats.time = info(iter).time + toc(timetic);
            stats.rho = rho;
            stats.rhonum = rho_num;
            stats.rhoden = rho_den;
            stats.accepted = accept;
            stats.subproblem = substats;
        end

        % Similar to statsfun with trustregions: the x and store passed to
        % statsfun are that of the most recently accepted point after the
        % iteration fully executed.
        stats = applyStatsfun(problem, x, storedb, key, options, stats);

    end

end
```
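
To try a different subproblem solver, pass a function handle through the
options structure. A short sketch (reusing the problem structure from the
earlier example; arc_lanczos ships with the toolbox, as listed above):

```
% Run ARC with the Lanczos-based subproblem solver instead of the
% default nonlinear conjugate gradient one.
options.subproblemsolver = @arc_lanczos;
[x, xcost, info] = arc(problem, [], options);
```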
