
using_counters

PURPOSE

Manopt example on how to use counters during optimization. Typical uses, as demonstrated here, include counting calls to cost, gradient and Hessian functions.

SYNOPSIS

function using_counters()

DESCRIPTION

 Manopt example on how to use counters during optimization. Typical uses,
 as demonstrated here, include counting calls to cost, gradient and
 Hessian functions. The example also demonstrates how to record total time
 spent in cost/grad/hess calls iteration by iteration.

 See also: statscounters incrementcounter statsfunhelper
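
 The core pattern is short. Below is a minimal sketch (not part of the
 original file; it assumes Manopt is on the path and uses a made-up
 one-counter problem) of the three ingredients: incrementcounter inside a
 store-aware cost function, statscounters plus statsfunhelper to register
 the counter with the solver, and the counter's field in the info
 struct-array returned by the solver.

 function counters_sketch()
     n = 100;
     A = randn(n);  A = .5*(A + A');
     problem.M = spherefactory(n);
     problem.cost  = @cost;
     problem.egrad = @(x) -A*x;               % gradient given without caching
     function [f, store] = cost(x, store)
         f = -.5*(x'*(A*x));
         % Bump a counter each time the cost is evaluated; the name is free.
         store = incrementcounter(store, 'costcalls');
     end
     stats = statscounters({'costcalls'});    % one handle per counter name
     options.statsfun = statsfunhelper(stats);
     [x, xcost, info] = steepestdescent(problem, [], options); %#ok<ASGLU>
     disp([info.costcalls]);                  % cumulative count, per iteration
 end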

CROSS-REFERENCE INFORMATION

This function calls: (none)
This function is called by: (none)

SUBFUNCTIONS

 function store = prepare(x, store)
 function [f, store] = cost(x, store)
 function [g, store] = egrad(x, store)
 function [h, store] = ehess(x, xdot, store)
 function [stop, reason] = stopfun(problem, x, info, last)

SOURCE CODE

0001 function using_counters()
0002 % Manopt example on how to use counters during optimization. Typical uses,
0003 % as demonstrated here, include counting calls to cost, gradient and
0004 % Hessian functions. The example also demonstrates how to record total time
0005 % spent in cost/grad/hess calls iteration by iteration.
0006 %
0007 % See also: statscounters incrementcounter statsfunhelper
0008 
0009 % This file is part of Manopt: www.manopt.org.
0010 % Original author: Nicolas Boumal, July 27, 2018.
0011 % Contributors:
0012 % Change log:
0013 
0014     rng(0);
0015 
0016     % Set up an optimization problem to illustrate the use of counters.
0017     n = 1000;
0018     A = randn(n);
0019     A = .5*(A+A');
0020     
0021     manifold = spherefactory(n);
0022     problem.M = manifold;
0023     
0024     
0025     % Define the problem cost function and its gradient.
0026     %
0027     % Since the most expensive operation in computing the cost and the
0028     % gradient at x is the product A*x, and since this operation is the
0029     % same for both the cost and the gradient, we use the caching
0030     % functionality of Manopt for this product. This function ensures the
0031     % product A*x is available in the store structure. Remember that a
0032     % store structure is associated with a particular point x: if cost and
0033     % egrad are called on the same point x, they will see the same store.
0034     function store = prepare(x, store)
0035         if ~isfield(store, 'Ax')
0036             store.Ax = A*x;
0037             % Increment a counter for the number of matrix-vector products
0038             % involving A. The names of the counters (here, Aproducts) are
0039             % for us to choose: they only need to be valid structure field
0040             % names. They need not have been defined in advance.
0041             store = incrementcounter(store, 'Aproducts');
0042         end
0043     end
0044     %
0045     problem.cost = @cost;
0046     function [f, store] = cost(x, store)
0047         t = tic();
0048         store = prepare(x, store);
0049         f = -.5*(x'*store.Ax);
0050         % Increment a counter for the number of calls to the cost function.
0051         store = incrementcounter(store, 'costcalls');
0052         % We also increment a counter with the amount of time spent in this
0053         % function (all counters are stored as doubles; here we exploit
0054         % this to track a non-integer quantity).
0055         store = incrementcounter(store, 'functiontime', toc(t));
0056     end
0057     %
0058     problem.egrad = @egrad;
0059     function [g, store] = egrad(x, store)
0060         t = tic();
0061         store = prepare(x, store);
0062         g = -store.Ax;
0063         % Count the number of calls to the gradient function.
0064         store = incrementcounter(store, 'gradcalls');
0065         % We also record time spent in this call, atop the same counter as
0066         % for the cost function.
0067         store = incrementcounter(store, 'functiontime', toc(t));
0068     end
0069     %
0070     problem.ehess = @ehess;
0071     function [h, store] = ehess(x, xdot, store) %#ok<INUSL>
0072         t = tic();
0073         h = -A*xdot;
0074         % Count the number of calls to the Hessian operator and also count
0075         % the matrix-vector product with A.
0076         store = incrementcounter(store, 'hesscalls');
0077         store = incrementcounter(store, 'Aproducts');
0078         % We also record time spent in this call, atop the same counter as the cost and gradient.
0079         store = incrementcounter(store, 'functiontime', toc(t));
0080     end
0081 
0082     
0083     % Set up a callback to log statistics. We use a combination of
0084     % statscounters and statsfunhelper to indicate which counters we
0085     % want the optimization algorithm to log. Here, stats is a structure
0086     % where each field is a function handle corresponding to one of the
0087     % counters. Before passing stats to statsfunhelper, we could add more
0088     % fields to stats to log other things as well (see the sketch after this listing).
0089     stats = statscounters({'costcalls', 'gradcalls', 'hesscalls', ...
0090                            'Aproducts', 'functiontime'});
0091     options.statsfun = statsfunhelper(stats);
0092 
0093     % As an example, we could set up a stopping criterion based on the
0094     % number of matrix-vector products. A short version:
0095     % options.stopfun = @(problem, x, info, last) info(last).Aproducts > 250;
0096     % A longer version that also returns a reason string:
0097     options.stopfun = @stopfun;
0098     function [stop, reason] = stopfun(problem, x, info, last) %#ok<INUSL>
0099         reason = 'Exceeded Aproducts budget.';
0100         stop = (info(last).Aproducts > 250);   % true if budget exceeded
0101         % Here, info(last) contains the stats of the latest iteration.
0102         % That includes all registered counters.
0103     end
0104     
0105     % Solve with different solvers to compare.
0106     options.tolgradnorm = 1e-9;
0107     [x, xcost, infortr] = trustregions(problem, [], options); %#ok<ASGLU>
0108     [x, xcost, inforcg] = conjugategradient(problem, [], options); %#ok<ASGLU>
0109     [x, xcost, infobfg] = rlbfgs(problem, [], options); %#ok<ASGLU>
0110     
0111     
0112     % Display some statistics. The logged data is available in the info
0113     % struct-arrays. Notice how the counters are available by their
0114     % corresponding field name.
0115     figure(1);
0116     subplot(3, 3, 1);
0117     semilogy([infortr.iter], [infortr.gradnorm], '.-', ...
0118              [inforcg.iter], [inforcg.gradnorm], '.-', ...
0119              [infobfg.iter], [infobfg.gradnorm], '.-');
0120     legend('RTR', 'RCG', 'RLBFGS');
0121     xlabel('Iteration #');
0122     ylabel('Gradient norm');
0123     ylim([1e-12, 1e2]); set(gca, 'YTick', [1e-12, 1e-6, 1e0]);
0124     subplot(3, 3, 2);
0125     semilogy([infortr.costcalls], [infortr.gradnorm], '.-', ...
0126              [inforcg.costcalls], [inforcg.gradnorm], '.-', ...
0127              [infobfg.costcalls], [infobfg.gradnorm], '.-');
0128     xlabel('# cost calls');
0129     ylabel('Gradient norm');
0130     ylim([1e-12, 1e2]); set(gca, 'YTick', [1e-12, 1e-6, 1e0]);
0131     subplot(3, 3, 3);
0132     semilogy([infortr.gradcalls], [infortr.gradnorm], '.-', ...
0133              [inforcg.gradcalls], [inforcg.gradnorm], '.-', ...
0134              [infobfg.gradcalls], [infobfg.gradnorm], '.-');
0135     xlabel('# gradient calls');
0136     ylabel('Gradient norm');
0137     ylim([1e-12, 1e2]); set(gca, 'YTick', [1e-12, 1e-6, 1e0]);
0138     subplot(3, 3, 4);
0139     semilogy([infortr.hesscalls], [infortr.gradnorm], '.-', ...
0140              [inforcg.hesscalls], [inforcg.gradnorm], '.-', ...
0141              [infobfg.hesscalls], [infobfg.gradnorm], '.-');
0142     xlabel('# Hessian calls');
0143     ylabel('Gradient norm');
0144     ylim([1e-12, 1e2]); set(gca, 'YTick', [1e-12, 1e-6, 1e0]);
0145     subplot(3, 3, 5);
0146     semilogy([infortr.Aproducts], [infortr.gradnorm], '.-', ...
0147              [inforcg.Aproducts], [inforcg.gradnorm], '.-', ...
0148              [infobfg.Aproducts], [infobfg.gradnorm], '.-');
0149     xlabel('# matrix-vector products');
0150     ylabel('Gradient norm');
0151     ylim([1e-12, 1e2]); set(gca, 'YTick', [1e-12, 1e-6, 1e0]);
0152     subplot(3, 3, 6);
0153     semilogy([infortr.time], [infortr.gradnorm], '.-', ...
0154              [inforcg.time], [inforcg.gradnorm], '.-', ...
0155              [infobfg.time], [infobfg.gradnorm], '.-');
0156     xlabel('Computation time [s]');
0157     ylabel('Gradient norm');
0158     ylim([1e-12, 1e2]); set(gca, 'YTick', [1e-12, 1e-6, 1e0]);
0159     subplot(3, 3, 7);
0160     semilogy([infortr.functiontime], [infortr.gradnorm], '.-', ...
0161              [inforcg.functiontime], [inforcg.gradnorm], '.-', ...
0162              [infobfg.functiontime], [infobfg.gradnorm], '.-');
0163     xlabel('Time spent in cost/grad/hess [s]');
0164     ylabel('Gradient norm');
0165     ylim([1e-12, 1e2]); set(gca, 'YTick', [1e-12, 1e-6, 1e0]);
0166     % The following plot makes it possible to investigate what fraction of
0167     % the time is spent inside the user-supplied functions (cost/grad/hess)
0168     % versus the total time spent by the solver. This gives a sense of the
0169     % relative importance of cost function-related computational costs vs a
0170     % solver's inner workings, retractions, and other solver-specific operations.
0171     subplot(3, 3, 8);
0172     maxtime = max([[infortr.time], [inforcg.time], [infobfg.time]]);
0173     plot([infortr.time], [infortr.functiontime], '.-', ...
0174          [inforcg.time], [inforcg.functiontime], '.-', ...
0175          [infobfg.time], [infobfg.functiontime], '.-', ...
0176          [0, maxtime], [0, maxtime], 'k--');
0177     axis tight;
0178     xlabel('Total computation time [s]');
0179     ylabel(sprintf('Time spent in\ncost/grad/hess [s]'));
0180     
0181 end
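
 The comment above the statsfunhelper call notes that extra fields can be
 added to the stats structure before it is passed to statsfunhelper. A
 hedged sketch of that extension (the field name xfirstentry is made up, and
 it assumes statsfunhelper also accepts plain function handles of the
 current point x):

 stats = statscounters({'costcalls', 'gradcalls', 'hesscalls', ...
                        'Aproducts', 'functiontime'});
 % Hypothetical extra statistic: first entry of the current point, logged at
 % every iteration alongside the counters.
 stats.xfirstentry = @(x) x(1);
 options.statsfun = statsfunhelper(stats);
 % After a solver run, it is read like any counter, e.g.:
 % plot([info.iter], [info.xfirstentry]);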
