Creates a Manopt problem whose optima are the critical points of another. problem_critpt = criticalpointfinder(problem) Given a Manopt problem structure 'problem', this tool returns a new problem structure, 'problem_critpt', such that the global optima of the new problem coincide with the critical points of the original problem. This can be useful notably in empirical studies of the properties of saddle points of a problem. Concretely, if f is the cost function of the given problem, grad f denotes its (Riemannian) gradient and Hess f denotes its (Riemannian) Hessian, then the new problem has a cost function g defined by: g(x) = (1/2)*norm(grad f(x))^2, where x is a point on the manifold problem.M (the new problem lives on the same manifold), and norm(.) = problem.M.norm(x, .) is the Riemannian norm on the tangent space at x. The Riemannian gradient of g is elegantly obtained from knowledge of f: grad g(x) = Hess f(x)[grad f(x)] If the Hessian of f is not available in the given problem, Manopt will approximate it automatically to compute an approximate gradient of g. If the Hessian of f is available, then an approximate Hessian of g is defined in the returned problem as approxhess g(x)[u] = Hess f(x)[ Hess f(x)[u] ]. This approximation is exact if x is a critical point of f, which is enough to ensure superlinear local convergence to critical points of f using the trustregions algorithm, for example. Once problem_critpt is obtained, it can be passed to any of the solvers of Manopt to compute critical points of the original problem. Supplying an initial point to the solver allows to aim for a critical point in a specific neighborhood of the search space. Usage example: The code below creates a problem whose optima are dominant eigenvectors of a matrix A and whose critical points are any eigenvectors of A, then compute critical points using the present tool: n = 100; A = randn(n); A = .5*(A+A'); problem.M = spherefactory(n); problem.cost = @(x) -x'*(A*x); problem.egrad = @(x) -2*A*x; problem.ehess = @(x, xdot) -2*A*xdot; problem_critpt = criticalpointfinder(problem); opts.tolcost = .5*(1e-5)^2; % aim for a gradient smaller than 1e-5 [x, fx] = trustregions(problem_critpt, [], opts); % random initial guess fprintf('Norm of the gradient at x: %g\n', sqrt(2*fx)); fprintf('This is small if x is close to being an eigenvector: %g\n',... norm((x'*A*x)*x - A*x)); % The two displayed numbers are equal up to a factor 2. See also: trustregions
0001 function problem_critpt = criticalpointfinder(problem) 0002 % Creates a Manopt problem whose optima are the critical points of another. 0003 % 0004 % problem_critpt = criticalpointfinder(problem) 0005 % 0006 % Given a Manopt problem structure 'problem', this tool returns a new 0007 % problem structure, 'problem_critpt', such that the global optima of the 0008 % new problem coincide with the critical points of the original problem. 0009 % This can be useful notably in empirical studies of the properties of 0010 % saddle points of a problem. 0011 % 0012 % Concretely, if f is the cost function of the given problem, grad f 0013 % denotes its (Riemannian) gradient and Hess f denotes its (Riemannian) 0014 % Hessian, then the new problem has a cost function g defined by: 0015 % 0016 % g(x) = (1/2)*norm(grad f(x))^2, 0017 % 0018 % where x is a point on the manifold problem.M (the new problem lives on 0019 % the same manifold), and norm(.) = problem.M.norm(x, .) is the Riemannian 0020 % norm on the tangent space at x. The Riemannian gradient of g is elegantly 0021 % obtained from knowledge of f: 0022 % 0023 % grad g(x) = Hess f(x)[grad f(x)] 0024 % 0025 % If the Hessian of f is not available in the given problem, Manopt will 0026 % approximate it automatically to compute an approximate gradient of g. 0027 % If the Hessian of f is available, then an approximate Hessian of g is 0028 % defined in the returned problem as 0029 % 0030 % approxhess g(x)[u] = Hess f(x)[ Hess f(x)[u] ]. 0031 % 0032 % This approximation is exact if x is a critical point of f, which is 0033 % enough to ensure superlinear local convergence to critical points of f 0034 % using the trustregions algorithm, for example. 0035 % 0036 % Once problem_critpt is obtained, it can be passed to any of the solvers 0037 % of Manopt to compute critical points of the original problem. Supplying 0038 % an initial point to the solver allows to aim for a critical point in a 0039 % specific neighborhood of the search space. 0040 % 0041 % 0042 % Usage example: 0043 % 0044 % The code below creates a problem whose optima are dominant eigenvectors 0045 % of a matrix A and whose critical points are any eigenvectors of A, then 0046 % compute critical points using the present tool: 0047 % 0048 % n = 100; A = randn(n); A = .5*(A+A'); 0049 % problem.M = spherefactory(n); 0050 % problem.cost = @(x) -x'*(A*x); 0051 % problem.egrad = @(x) -2*A*x; 0052 % problem.ehess = @(x, xdot) -2*A*xdot; 0053 % problem_critpt = criticalpointfinder(problem); 0054 % opts.tolcost = .5*(1e-5)^2; % aim for a gradient smaller than 1e-5 0055 % [x, fx] = trustregions(problem_critpt, [], opts); % random initial guess 0056 % fprintf('Norm of the gradient at x: %g\n', sqrt(2*fx)); 0057 % fprintf('This is small if x is close to being an eigenvector: %g\n',... 0058 % norm((x'*A*x)*x - A*x)); 0059 % % The two displayed numbers are equal up to a factor 2. 0060 % 0061 % 0062 % See also: trustregions 0063 0064 % This file is part of Manopt: www.manopt.org. 0065 % Original author: Nicolas Boumal, Jan. 25, 2017. 0066 % Contributors: 0067 % Change log: 0068 0069 % TODO: Determine a safe way of using the caching functionalities of Manopt 0070 % with this tool. The issue in passing along storedb and key in the 0071 % costgrad and approxhess functions is that the storedb will be 0072 % associated to problem_critpt, not to problem. This may cause bugs 0073 % that would be very difficult to catch. To be on the safe side, 0074 % caching is not used at all here, but this may cause running times 0075 % to be longer than necessary. To create a local storedb associated 0076 % to problem and to only use the key seems to also not be a viable 0077 % solution, since there is no clear way of resetting it to zero 0078 % everytime a solver is called on problem_critpt. 0079 % -- Jan. 26, 2017 (NB) 0080 0081 problem_critpt.M = problem.M; 0082 problem_critpt.costgrad = @costgrad; 0083 0084 % If the Hessian is available for the problem, we build an approximate 0085 % Hessian based on it. Otherwise, there is no reason to believe that 0086 % this approximate Hessian would be better than the standard 0087 % approximate Hessian created by Manopt. 0088 if canGetHessian(problem) 0089 problem_critpt.approxhess = @approxhess; 0090 end 0091 0092 function [g, gradg] = costgrad(x) 0093 0094 gradf = getGradient(problem, x); 0095 Hessf_gradf = getHessian(problem, x, gradf); 0096 0097 g = .5*problem.M.norm(x, gradf)^2; 0098 gradg = Hessf_gradf; 0099 0100 end 0101 0102 % This is not quite the Hessian because there should be a third-order 0103 % derivative term (which is inaccessible), but: at critical points 0104 % (where grad f(x) = 0 for the f of problem.cost) this Hessian is 0105 % exact, so it will allow for superlinear local convergence in 0106 % algorithms such as trustregions. 0107 function HHu = approxhess(x, u) 0108 0109 Hu = getHessian(problem, x, u); 0110 HHu = getHessian(problem, x, Hu); 0111 0112 end 0113 0114 end