Home > manopt > tools > criticalpointfinder.m

criticalpointfinder

PURPOSE ^

Creates a Manopt problem whose optima are the critical points of another.

SYNOPSIS ^

function problem_critpt = criticalpointfinder(problem)

DESCRIPTION ^

 Creates a Manopt problem whose optima are the critical points of another.

 problem_critpt = criticalpointfinder(problem)

 Given a Manopt problem structure 'problem', this tool returns a new
 problem structure, 'problem_critpt', such that the global optima of the
 new problem coincide with the critical points of the original problem.
 This can be useful notably in empirical studies of the properties of
 saddle points of a problem.

 Concretely, if f is the cost function of the given problem, grad f
 denotes its (Riemannian) gradient and Hess f denotes its (Riemannian)
 Hessian, then the new problem has a cost function g defined by:

   g(x) = (1/2)*norm(grad f(x))^2,

 where x is a point on the manifold problem.M (the new problem lives on
 the same manifold), and norm(.) = problem.M.norm(x, .) is the Riemannian
 norm on the tangent space at x. The Riemannian gradient of g is elegantly
 obtained from knowledge of f:

   grad g(x) = Hess f(x)[grad f(x)]

 If the Hessian of f is not available in the given problem, Manopt will
 approximate it automatically to compute an approximate gradient of g.
 If the Hessian of f is available, then an approximate Hessian of g is
 defined in the returned problem as

  approxhess g(x)[u] = Hess f(x)[ Hess f(x)[u] ].

 This approximation is exact if x is a critical point of f, which is
 enough to ensure superlinear local convergence to critical points of f
 using the trustregions algorithm, for example.

 Once problem_critpt is obtained, it can be passed to any of the solvers
 of Manopt to compute critical points of the original problem. Supplying
 an initial point to the solver allows to aim for a critical point in a
 specific neighborhood of the search space.


 Usage example:
 
 The code below creates a problem whose optima are dominant eigenvectors
 of a matrix A and whose critical points are any eigenvectors of A, then
 compute critical points using the present tool:

 n = 100; A = randn(n); A = .5*(A+A');
 problem.M = spherefactory(n);
 problem.cost  = @(x) -x'*(A*x);
 problem.egrad = @(x) -2*A*x;
 problem.ehess = @(x, xdot) -2*A*xdot;
 problem_critpt = criticalpointfinder(problem);
 opts.tolcost = .5*(1e-5)^2; % aim for a gradient smaller than 1e-5
 [x, fx] = trustregions(problem_critpt, [], opts); % random initial guess
 fprintf('Norm of the gradient at x: %g\n', sqrt(2*fx));
 fprintf('This is small if x is close to being an eigenvector: %g\n',...
         norm((x'*A*x)*x - A*x));
 % The two displayed numbers are equal up to a factor 2.


 See also: trustregions

CROSS-REFERENCE INFORMATION ^

This function calls: This function is called by:

SUBFUNCTIONS ^

SOURCE CODE ^

0001 function problem_critpt = criticalpointfinder(problem)
0002 % Creates a Manopt problem whose optima are the critical points of another.
0003 %
0004 % problem_critpt = criticalpointfinder(problem)
0005 %
0006 % Given a Manopt problem structure 'problem', this tool returns a new
0007 % problem structure, 'problem_critpt', such that the global optima of the
0008 % new problem coincide with the critical points of the original problem.
0009 % This can be useful notably in empirical studies of the properties of
0010 % saddle points of a problem.
0011 %
0012 % Concretely, if f is the cost function of the given problem, grad f
0013 % denotes its (Riemannian) gradient and Hess f denotes its (Riemannian)
0014 % Hessian, then the new problem has a cost function g defined by:
0015 %
0016 %   g(x) = (1/2)*norm(grad f(x))^2,
0017 %
0018 % where x is a point on the manifold problem.M (the new problem lives on
0019 % the same manifold), and norm(.) = problem.M.norm(x, .) is the Riemannian
0020 % norm on the tangent space at x. The Riemannian gradient of g is elegantly
0021 % obtained from knowledge of f:
0022 %
0023 %   grad g(x) = Hess f(x)[grad f(x)]
0024 %
0025 % If the Hessian of f is not available in the given problem, Manopt will
0026 % approximate it automatically to compute an approximate gradient of g.
0027 % If the Hessian of f is available, then an approximate Hessian of g is
0028 % defined in the returned problem as
0029 %
0030 %  approxhess g(x)[u] = Hess f(x)[ Hess f(x)[u] ].
0031 %
0032 % This approximation is exact if x is a critical point of f, which is
0033 % enough to ensure superlinear local convergence to critical points of f
0034 % using the trustregions algorithm, for example.
0035 %
0036 % Once problem_critpt is obtained, it can be passed to any of the solvers
0037 % of Manopt to compute critical points of the original problem. Supplying
0038 % an initial point to the solver allows to aim for a critical point in a
0039 % specific neighborhood of the search space.
0040 %
0041 %
0042 % Usage example:
0043 %
0044 % The code below creates a problem whose optima are dominant eigenvectors
0045 % of a matrix A and whose critical points are any eigenvectors of A, then
0046 % compute critical points using the present tool:
0047 %
0048 % n = 100; A = randn(n); A = .5*(A+A');
0049 % problem.M = spherefactory(n);
0050 % problem.cost  = @(x) -x'*(A*x);
0051 % problem.egrad = @(x) -2*A*x;
0052 % problem.ehess = @(x, xdot) -2*A*xdot;
0053 % problem_critpt = criticalpointfinder(problem);
0054 % opts.tolcost = .5*(1e-5)^2; % aim for a gradient smaller than 1e-5
0055 % [x, fx] = trustregions(problem_critpt, [], opts); % random initial guess
0056 % fprintf('Norm of the gradient at x: %g\n', sqrt(2*fx));
0057 % fprintf('This is small if x is close to being an eigenvector: %g\n',...
0058 %         norm((x'*A*x)*x - A*x));
0059 % % The two displayed numbers are equal up to a factor 2.
0060 %
0061 %
0062 % See also: trustregions
0063 
0064 % This file is part of Manopt: www.manopt.org.
0065 % Original author: Nicolas Boumal, Jan. 25, 2017.
0066 % Contributors:
0067 % Change log:
0068 
0069 % TODO: Determine a safe way of using the caching functionalities of Manopt
0070 %       with this tool. The issue in passing along storedb and key in the
0071 %       costgrad and approxhess functions is that the storedb will be
0072 %       associated to problem_critpt, not to problem. This may cause bugs
0073 %       that would be very difficult to catch. To be on the safe side,
0074 %       caching is not used at all here, but this may cause running times
0075 %       to be longer than necessary. To create a local storedb associated
0076 %       to problem and to only use the key seems to also not be a viable
0077 %       solution, since there is no clear way of resetting it to zero
0078 %       everytime a solver is called on problem_critpt.
0079 %       -- Jan. 26, 2017 (NB)
0080 
0081     problem_critpt.M = problem.M;
0082     problem_critpt.costgrad = @costgrad;
0083     
0084     % If the Hessian is available for the problem, we build an approximate
0085     % Hessian based on it. Otherwise, there is no reason to believe that
0086     % this approximate Hessian would be better than the standard
0087     % approximate Hessian created by Manopt.
0088     if canGetHessian(problem)
0089         problem_critpt.approxhess = @approxhess;
0090     end
0091     
0092     function [g, gradg] = costgrad(x)
0093         
0094         gradf = getGradient(problem, x);
0095         Hessf_gradf = getHessian(problem, x, gradf);
0096         
0097         g = .5*problem.M.norm(x, gradf)^2;
0098         gradg = Hessf_gradf;
0099         
0100     end
0101     
0102     % This is not quite the Hessian because there should be a third-order
0103     % derivative term (which is inaccessible), but: at critical points
0104     % (where grad f(x) = 0 for the f of problem.cost) this Hessian is
0105     % exact, so it will allow for superlinear local convergence in
0106     % algorithms such as trustregions.
0107     function HHu = approxhess(x, u)
0108         
0109         Hu  = getHessian(problem, x, u);
0110         HHu = getHessian(problem, x, Hu);
0111         
0112     end
0113 
0114 end

Generated on Fri 30-Sep-2022 13:18:25 by m2html © 2005