checkhessian

PURPOSE

Checks the consistency of the cost function and the Hessian.

SYNOPSIS

function checkhessian(problem, x, d)

DESCRIPTION

 Checks the consistency of the cost function and the Hessian.

 function checkhessian(problem)
 function checkhessian(problem, x)
 function checkhessian(problem, x, d)

 checkhessian performs a numerical test to check that the directional
 derivatives and Hessian defined in the problem structure agree up to
 second order with the cost function at some point x, along some direction
 d. The test is based on a truncated Taylor series (see online Manopt
 documentation).
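
 In symbols, writing Exp_x for the exponential map (or a second-order
 retraction) and <., .>_x for the Riemannian metric at x, the comparison is
 against the second-order model

   $$ f(\mathrm{Exp}_x(h\,d)) = f(x) + h\,\langle \mathrm{grad}\,f(x),\, d\rangle_x
      + \tfrac{h^2}{2}\,\langle \mathrm{Hess}\,f(x)[d],\, d\rangle_x + O(h^3), $$

 so that, on a log-log plot, the model error as a function of the stepsize h
 should exhibit a slope of (about) 3 over some range of h.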
 
 It is also tested that the result of applying the Hessian along that
 direction is indeed a tangent vector, and that the Hessian operator is
 symmetric w.r.t. the Riemannian metric.
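
 Concretely, symmetry here means that, for all tangent vectors d1, d2 at x,

   $$ \langle \mathrm{Hess}\,f(x)[d_1],\, d_2 \rangle_x = \langle d_1,\, \mathrm{Hess}\,f(x)[d_2] \rangle_x . $$

 The source below also runs an explicit linearity check on the Hessian
 operator.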
 
 Both x and d are optional and will be sampled at random if omitted.

 The slope test requires the exponential map of the manifold, or at least
 a second-order retraction. If M.exp() is not available, the retraction is
 used and a message is issued to instruct the user to check whether M.retr
 is second-order or not. If it is not, the slope test is only valid at
 critical points of the cost function (which can be computed by
 optimization).

 See also: checkdiff checkgradient checkretraction
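
 For illustration, here is a minimal usage sketch. The sphere manifold and
 the Rayleigh-quotient cost are arbitrary example choices (not part of
 checkhessian itself); it assumes Manopt, including spherefactory, is on the
 path.

    % Example (illustrative): check the Hessian of f(x) = -x'*A*x on the sphere.
    n = 100;
    A = randn(n);
    A = .5*(A + A.');               % symmetric data matrix for this example
    problem.M = spherefactory(n);   % unit sphere in R^n
    problem.cost  = @(x) -x.'*(A*x);
    problem.egrad = @(x) -2*A*x;    % Euclidean gradient (Manopt converts it)
    problem.ehess = @(x, u) -2*A*u; % Euclidean Hessian along the direction u
    figure;
    checkhessian(problem);          % x and d are sampled at random

 If everything is consistent, the error curve should match the slope-3
 reference line over several orders of magnitude of h, and the tangency,
 linearity and symmetry residuals printed in the command window should be
 close to zero.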


SOURCE CODE

function checkhessian(problem, x, d)
% Checks the consistency of the cost function and the Hessian.
%
% function checkhessian(problem)
% function checkhessian(problem, x)
% function checkhessian(problem, x, d)
%
% checkhessian performs a numerical test to check that the directional
% derivatives and Hessian defined in the problem structure agree up to
% second order with the cost function at some point x, along some direction
% d. The test is based on a truncated Taylor series (see online Manopt
% documentation).
%
% It is also tested that the result of applying the Hessian along that
% direction is indeed a tangent vector, and that the Hessian operator is
% symmetric w.r.t. the Riemannian metric.
%
% Both x and d are optional and will be sampled at random if omitted.
%
% The slope test requires the exponential map of the manifold, or at least
% a second-order retraction. If M.exp() is not available, the retraction is
% used and a message is issued to instruct the user to check whether M.retr
% is second-order or not. If it is not, the slope test is only valid at
% critical points of the cost function (which can be computed by
% optimization).
%
% See also: checkdiff checkgradient checkretraction

% This file is part of Manopt: www.manopt.org.
% Original author: Nicolas Boumal, Dec. 30, 2012.
% Contributors:
% Change log:
%
%   April 3, 2015 (NB):
%       Works with the new StoreDB class system.
%
%   Nov. 1, 2016 (NB):
%       Issues a call to getGradient rather than getDirectionalDerivative.
%
%   March 26, 2017 (JB):
%       Detects if the approximated quadratic model is exact
%       and provides the user with the corresponding feedback.
%
%   Dec. 6, 2017 (NB):
%       Added message in case tangent2ambient might be necessary in
%       defining ehess (this was a common difficulty for users).
%
%   Aug. 2, 2018 (NB):
%       Using storedb.remove() to avoid unnecessary cache build-up.
%
%   Sep. 6, 2018 (NB):
%       Now checks whether M.exp() is available; uses retraction otherwise
%       and issues a message that the user should check whether the
%       retraction is second-order or not.
%
%   Feb. 1, 2020 (NB):
%       Added an explicit linearity check.


    % Verify that the problem description is sufficient.
    if ~canGetCost(problem)
        error('It seems no cost was provided.');
    end
    if ~canGetGradient(problem)
        warning('manopt:checkhessian:nograd', ...
                'It seems no gradient was provided.');
    end
    if ~canGetHessian(problem)
        warning('manopt:checkhessian:nohess', ...
                'It seems no Hessian was provided.');
    end
    
    x_isprovided = exist('x', 'var') && ~isempty(x);
    d_isprovided = exist('d', 'var') && ~isempty(d);
    
    if ~x_isprovided && d_isprovided
        error('If d is provided, x must be too, since d is tangent at x.');
    end
    
    % If x and / or d are not specified, pick them at random.
    if ~x_isprovided
        x = problem.M.rand();
    end
    if ~d_isprovided
        d = problem.M.randvec(x);
    end
    
    %% Check that the directional derivative and the Hessian at x along d
    %% yield a second order model of the cost function.
    
    % Compute the value f0 of f at x, the directional derivative df0 at x
    % along d, and the quadratic form d2f0 of the Hessian along [d, d].
    storedb = StoreDB();
    xkey = storedb.getNewKey();
    f0 = getCost(problem, x, storedb, xkey);
    gradx = getGradient(problem, x, storedb, xkey);
    df0 = problem.M.inner(x, d, gradx);
    hessxd = getHessian(problem, x, d, storedb, xkey);
    d2f0 = problem.M.inner(x, d, hessxd);
    
    
    
    % Pick a stepping function: exponential or retraction?
    if isfield(problem.M, 'exp')
        stepper = problem.M.exp;
        extra_message = '';
    else
        stepper = problem.M.retr;
        fprintf(['* M.exp() is not available: using M.retr() instead.\n' ...
                 '* Please check the manifold documentation to see if\n' ...
                 '* the retraction is second order. If not, the slope\n' ...
                 '* test is allowed to fail at non-critical x.\n']);
        extra_message = ['(But do mind the message above: the slope may\n' ...
                         'be allowed to be 2 at non-critical points x.)\n'];
    end
    
    
    % Compute the value of f at points on the geodesic (or approximation
    % of it) originating from x, along direction d, for stepsizes in a
    % large range given by h.
    h = logspace(-8, 0, 51);
    value = zeros(size(h));
    for i = 1 : length(h)
        y = stepper(x, d, h(i));
        ykey = storedb.getNewKey();
        value(i) = getCost(problem, y, storedb, ykey);
        storedb.remove(ykey); % no need to keep it in memory
    end
    
    % Compute the quadratic approximation of the cost function using f0,
    % df0 and d2f0 at the same points.
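    % Note: polyval takes coefficients in decreasing powers of h, so this
    % evaluates .5*d2f0*h.^2 + df0*h + f0 at every stepsize in h.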
    model = polyval([.5*d2f0 df0 f0], h);
    
    % Compute the approximation error
    err = abs(model - value);
    
    % And plot it.
    loglog(h, err);
    title(sprintf(['Hessian check.\nThe slope of the continuous line ' ...
                   'should match that of the dashed\n(reference) line ' ...
                   'over at least a few orders of magnitude for h.']));
    xlabel('h');
    ylabel('Approximation error');
    
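    % Dashed reference line: from (1e-8, 1e-16) to (1e0, 1e8), it spans 24
    % orders of magnitude in err over 8 in h, i.e., slope 3 in log-log scale.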
    line('xdata', [1e-8 1e0], 'ydata', [1e-16 1e8], ...
         'color', 'k', 'LineStyle', '--', ...
         'YLimInclude', 'off', 'XLimInclude', 'off');
    
    
    if ~all( err < 1e-12 )
        % In a numerically reasonable neighborhood, the error should
        % decrease as the cube of the stepsize, i.e., in loglog scale, the
        % error should have a slope of 3.
        isModelExact = false;
        window_len = 10;
        [range, poly] = identify_linear_piece(log10(h), log10(err), window_len);
    else
        % The 2nd order model is exact: all errors are (numerically) zero
        % Fit line from all points, use log scale only in h.
        isModelExact = true;
        range = 1:numel(h);
        poly = polyfit(log10(h), err, 1);
        % Set mean error in log scale for plot
        poly(end) = log10(poly(end));
        % Change title to something more descriptive for this special case.
        title(sprintf(...
              ['Hessian check.\n'...
               'It seems the quadratic model is exact:\n'...
               'Model error is numerically zero for all h.']));
    end
    hold all;
    loglog(h(range), 10.^polyval(poly, log10(h(range))), 'LineWidth', 3);
    hold off;
    
    if ~isModelExact
        fprintf('The slope should be 3. It appears to be: %g.\n', poly(1));
        fprintf(['If it is far from 3, then directional derivatives,\n' ...
                 'the gradient or the Hessian might be erroneous.\n', ...
                 extra_message]);
    else
        fprintf(['The quadratic model appears to be exact ' ...
                 '(within numerical precision),\n'...
                 'hence the slope computation is irrelevant.\n']);
    end

    
    %% Check that the Hessian at x along direction d is a tangent vector.
    if isfield(problem.M, 'tangent')
        hess = getHessian(problem, x, d, storedb, xkey);
        phess = problem.M.tangent(x, hess);
        residual = problem.M.lincomb(x, 1, hess, -1, phess);
        err = problem.M.norm(x, residual);
        fprintf('Tangency residual should be zero, or very close; ');
        fprintf('residual: %g.\n', err);
        fprintf(['If it is far from 0, then the Hessian is not in the ' ...
                 'tangent space.\n']);
    else
        fprintf(['Unfortunately, Manopt was unable to verify that the '...
                 'output of the Hessian call is indeed a tangent ' ...
                 'vector.\nPlease verify this manually.']);
    end
    
    %% Check that the Hessian at x is linear and symmetric.
    d1 = problem.M.randvec(x);
    d2 = problem.M.randvec(x);
    h1 = getHessian(problem, x, d1, storedb, xkey);
    h2 = getHessian(problem, x, d2, storedb, xkey);
    
    % Linearity check
    a = randn(1);
    b = randn(1);
    ad1pbd2 = problem.M.lincomb(x, a, d1, b, d2);
    had1pbd2 = getHessian(problem, x, ad1pbd2, storedb, xkey);
    ahd1pbhd2 = problem.M.lincomb(x, a, h1, b, h2);
    errvec = problem.M.lincomb(x, 1, had1pbd2, -1, ahd1pbhd2);
    errvecnrm = problem.M.norm(x, errvec);
    had1pbd2nrm = problem.M.norm(x, had1pbd2);
    fprintf(['||a*H[d1] + b*H[d2] - H[a*d1+b*d2]|| should be zero, or ' ...
             'very close.\n\tValue: %g (norm of H[a*d1+b*d2]: %g)\n'], ...
             errvecnrm, had1pbd2nrm);
    fprintf('If it is far from 0, then the Hessian is not linear.\n');
    
    % Symmetry check
    v1 = problem.M.inner(x, d1, h2);
    v2 = problem.M.inner(x, h1, d2);
    value = v1-v2;
    fprintf(['<d1, H[d2]> - <H[d1], d2> should be zero, or very close.' ...
             '\n\tValue: %g - %g = %g.\n'], v1, v2, value);
    fprintf('If it is far from 0, then the Hessian is not symmetric.\n');
    
    %% Check if the manifold at hand is one of those for which there should
    %  be a call to M.tangent2ambient, as this is a common mistake. If so,
    %  issue an additional message. Ideally, one would just need to check
    %  for the presence of tangent2ambient, but productmanifold (for
    %  example) generates one of those in all cases, even if it is just an
    %  identity map.
    if isfield(problem.M, 'tangent2ambient_is_identity') && ...
                                     ~problem.M.tangent2ambient_is_identity
        
        fprintf('\n\n=== Special message ===\n');
        fprintf(['For this manifold, tangent vectors are represented\n' ...
                 'differently from their ambient space representation.\n' ...
                 'In practice, this means that when defining\n' ...
                 'v = problem.ehess(x, u), one may need to call\n' ...
                 'u = problem.M.tangent2ambient(x, u) first, so as to\n'...
                 'transform u into an ambient vector, if this is more\n' ...
                 'convenient. The output of ehess should be an ambient\n' ...
                 'vector (it will be transformed to a tangent vector\n' ...
                 'automatically).\n']);
        
    end

    if ~canGetHessian(problem)
        norm_grad = problem.M.norm(x, gradx);
        fprintf(['\nWhen using checkhessian with a finite difference ' ...
                 'approximation, the norm of the residual\nshould be ' ...
                 'compared against the norm of the gradient at the ' ...
                 'point under consideration (%.2g).\nFurthermore, it ' ...
                 'is expected that the FD operator is only approximately' ...
                 ' symmetric.\nOf course, the slope can also be off.\n'], ...
                 norm_grad);
    end
    
end
