Returns an SVD decomposition of A truncated to rank p. function [U, S, V, info] = truncated_svd(A, p) Input: A real matrix A of size mxn and an integer p <= min(m, n). Output: An orthonormal matrix U of size mxp, an orthonormal matrix Y of size nxp and a diagonal matrix S of size pxp with nonnegative and decreasing diagonal entries such that USV.' is the best rank p approximation of A according to the Frobenius norm. All real. This function produces an output akin to svds. The decomposition is obtained by maximizing f(U, V) = .5*norm(U'*A*V, 'fro')^2 where U, V are orthonormal. Notice that f(U*Q, V*R) = f(U, V) for all Q, R orthogonal pxp matrices. Hence, only the column spaces of U and V matter and we may perform the optimization over a product of two Grassmannian manifolds. It is easy to show that maximizing f is equivalent to minimizing g with g(U, V) = min_S norm(U*S*V' - A, 'fro')^2, which confirms that we are going for a best low-rank approximation of A. The inner workings of the Grassmann manifold use the built-in svd function of Matlab but only for matrices of size mxp and nxp to re-orthonormalize them. Notice that we are actually chasing a best fixed-rank approximation of a matrix, which is best obtained by working directly over a manifold of fixed-rank matrices. This is simply an example script to demonstrate some functionalities of the toolbox. The code can be modified to accept a function handle for A(x) = A*x instead of a matrix A, which is often useful. This would further require a function handle At for the transpose of A, such that At(x) = A.'*x.

- grassmannfactory Returns a manifold struct to optimize over the space of vector subspaces.
- trustregions Riemannian trust-regions solver for optimization on manifolds.
- hessianspectrum Returns the eigenvalues of the (preconditioned) Hessian at x.
- productmanifold Returns a structure describing a product manifold M = M1 x M2 x ... x Mn.

0001 function [U, S, V, info] = truncated_svd(A, p) 0002 % Returns an SVD decomposition of A truncated to rank p. 0003 % 0004 % function [U, S, V, info] = truncated_svd(A, p) 0005 % 0006 % Input: A real matrix A of size mxn and an integer p <= min(m, n). 0007 % Output: An orthonormal matrix U of size mxp, an orthonormal matrix Y of 0008 % size nxp and a diagonal matrix S of size pxp with nonnegative and 0009 % decreasing diagonal entries such that USV.' is the best rank p 0010 % approximation of A according to the Frobenius norm. All real. 0011 % This function produces an output akin to svds. 0012 % 0013 % The decomposition is obtained by maximizing 0014 % f(U, V) = .5*norm(U'*A*V, 'fro')^2 0015 % where U, V are orthonormal. Notice that f(U*Q, V*R) = f(U, V) for all 0016 % Q, R orthogonal pxp matrices. Hence, only the column spaces of U and V 0017 % matter and we may perform the optimization over a product of two 0018 % Grassmannian manifolds. 0019 % 0020 % It is easy to show that maximizing f is equivalent to minimizing g with 0021 % g(U, V) = min_S norm(U*S*V' - A, 'fro')^2, 0022 % which confirms that we are going for a best low-rank approximation of A. 0023 % 0024 % The inner workings of the Grassmann manifold use the built-in svd 0025 % function of Matlab but only for matrices of size mxp and nxp to 0026 % re-orthonormalize them. 0027 % 0028 % Notice that we are actually chasing a best fixed-rank approximation of a 0029 % matrix, which is best obtained by working directly over a manifold of 0030 % fixed-rank matrices. This is simply an example script to demonstrate some 0031 % functionalities of the toolbox. 0032 % 0033 % The code can be modified to accept a function handle for A(x) = A*x 0034 % instead of a matrix A, which is often useful. This would further require 0035 % a function handle At for the transpose of A, such that At(x) = A.'*x. 0036 0037 % This file is part of Manopt and is copyrighted. See the license file. 0038 % 0039 % Main author: Nicolas Boumal, July 5, 2013 0040 % Contributors: 0041 % 0042 % Change log: 0043 % 0044 0045 0046 % Generate some random data to test the function if none is given. 0047 if ~exist('A', 'var') || isempty(A) 0048 A = randn(42, 60); 0049 end 0050 if ~exist('p', 'var') || isempty(p) 0051 p = 5; 0052 end 0053 0054 % Retrieve the size of the problem and make sure the requested 0055 % approximation rank is at most the maximum possible rank. 0056 [m, n] = size(A); 0057 assert(p <= min(m, n), 'p must be smaller than the smallest dimension of A.'); 0058 0059 % Define the cost and its derivatives on the Grassmann manifold 0060 tuple.U = grassmannfactory(m, p); 0061 tuple.V = grassmannfactory(n, p); 0062 % All of the code will work just as well if we ignore the invariance 0063 % property of the cost function indicated above and thus place U and V 0064 % on the Stiefel manifold (orthonormal matrices) instead of the 0065 % Grassmann manifold. Working on Stiefel is expected to be slower 0066 % though, partly because de search space is higher dimensional and 0067 % partly because the optimizers are not isolated. 0068 % tuple.U = stiefelfactory(m, p); 0069 % tuple.V = stiefelfactory(n, p); 0070 M = productmanifold(tuple); 0071 0072 % Define a problem structure, specifying the manifold M, the cost 0073 % function and its derivatives. Here, to demonstrate the rapid 0074 % prototyping capabilities of Manopt, we directly define the Euclidean 0075 % gradient and the Euclidean Hessian egrad and ehess instead of the 0076 % Riemannian gradient and Hessian grad and hess. Manopt will take care 0077 % of the conversion. This automatic conversion is usually not 0078 % computationally optimal though, because much of the computations 0079 % involved in obtaining the gradient could be reused to obtain the 0080 % Hessian. After the prototyping stage, when efficiency becomes 0081 % important, it makes sense to define grad and hess rather than egrad 0082 % an ehess, and to use the caching system (the store structure). 0083 problem.M = M; 0084 problem.cost = @cost; 0085 problem.egrad = @egrad; 0086 problem.ehess = @ehess; 0087 0088 % The functions below make many redundant computations. This 0089 % performance hit can be alleviated by using the caching system. 0090 0091 % Cost function 0092 function f = cost(X) 0093 U = X.U; 0094 V = X.V; 0095 f = -.5*norm(U'*A*V, 'fro')^2; 0096 end 0097 % Euclidean gradient of the cost function 0098 function g = egrad(X) 0099 U = X.U; 0100 V = X.V; 0101 AV = A*V; 0102 AtU = A'*U; 0103 g.U = -AV*(AV'*U); 0104 g.V = -AtU*(AtU'*V); 0105 end 0106 % Euclidean Hessian of the cost function 0107 function h = ehess(X, H) 0108 U = X.U; 0109 V = X.V; 0110 Udot = H.U; 0111 Vdot = H.V; 0112 AV = A*V; 0113 AtU = A'*U; 0114 AVdot = A*Vdot; 0115 AtUdot = A'*Udot; 0116 h.U = -(AVdot*AV'*U + AV*AVdot'*U + AV*AV'*Udot); 0117 h.V = -(AtUdot*AtU'*V + AtU*AtUdot'*V + AtU*AtU'*Vdot); 0118 end 0119 0120 0121 % Execute some checks on the derivatives for early debugging. 0122 % These things can be commented out of course. 0123 % checkgradient(problem); 0124 % pause; 0125 % checkhessian(problem); 0126 % pause; 0127 0128 % Issue a call to a solver. A random initial guess will be chosen and 0129 % default options are selected. Here, we specify a maximum trust 0130 % region radius (which in turn induces an initial trust region radius). 0131 % Note that this is not required: default values are used if we omit 0132 % this. The diameter of the manifold scales like sqrt(2*p), hence the 0133 % form of our (empirical) choice. 0134 options.Delta_bar = 4*sqrt(2*p); 0135 [X, Xcost, info] = trustregions(problem, [], options); %#ok<ASGLU> 0136 U = X.U; 0137 V = X.V; 0138 0139 % Finish the job by rotating U and V such that the middle matrix S can 0140 % be diagonal with nonnegative, decreasing entries. This requires a 0141 % small svd of size pxp. 0142 Spp = U'*A*V; 0143 [Upp, Spp, Vpp] = svd(Spp); 0144 U = U*Upp; 0145 S = Spp; 0146 V = V*Vpp; 0147 0148 % For our information, Manopt can also compute the spectrum of the 0149 % Riemannian Hessian on the tangent space at (any) X. Computing the 0150 % spectrum at the solution gives us some idea of the conditioning of 0151 % the problem. If we were to implement a preconditioner for the 0152 % Hessian, this would also inform us on its performance. 0153 % 0154 % Notice that if the optimization is performed on a product of Stiefel 0155 % manifolds instead of a product of Grassmannians, the double 0156 % invariance under the orthogonal group O(p) will appear as twice 0157 % p*(p-1)/2, thus p*(p-1) zero eigenvalues in the spectrum of the 0158 % Hessian. This means that the minimizers are not isolated, which 0159 % typically hinders convergence of second order algorithms. 0160 if M.dim() < 512 0161 evs = hessianspectrum(problem, X); 0162 stairs(sort(evs)); 0163 title(['Eigenvalues of the Hessian of the cost function ' ... 0164 'at the solution']); 0165 end 0166 0167 end

Generated on Mon 10-Sep-2018 11:48:06 by