function M = multinomialdoublystochasticfactory(n)
% Manifold of n-by-n doubly-stochastic matrices with positive entries.
%
% function M = multinomialdoublystochasticfactory(n)
%
% M is a Manopt manifold structure to optimize over the set of n-by-n
% matrices with (strictly) positive entries and such that the entries of
% each column and each row sum to one.
%
% Points on the manifold and tangent vectors are represented naturally as
% matrices of size n. The Riemannian metric imposed on the manifold is the
% Fisher metric, that is, if X is a point on the manifold and U, V are two
% tangent vectors:
%
%     M.inner(X, U, V) = <U, V>_X = sum(sum(U.*V./X)).
%
% The retraction here provided is only first order. Consequently, the
% slope test in the checkhessian tool is only valid at points X where the
% gradient is zero. Furthermore, if some entries of X are very close to
% zero, this may cause numerical difficulties that can also lead to a
% failed slope test. More generally, it is important that the solution of
% the optimization problem should have positive entries, sufficiently far
% away from zero to avoid numerical issues.
%
% The file is based on developments in the research paper
% A. Douik and B. Hassibi, "Manifold Optimization Over the Set
% of Doubly Stochastic Matrices: A Second-Order Geometry",
% ArXiv:1802.02628, 2018.
%
% Link to the paper: https://arxiv.org/abs/1802.02628.
%
% Please cite the Manopt paper as well as the research paper:
%     @Techreport{Douik2018Manifold,
%       Title   = {Manifold Optimization Over the Set of Doubly Stochastic
%                  Matrices: {A} Second-Order Geometry},
%       Author  = {Douik, A. and Hassibi, B.},
%       Journal = {Arxiv preprint ArXiv:1802.02628},
%       Year    = {2018}
%     }
%
% See also: multinomialsymmetricfactory multinomialfactory

% This file is part of Manopt: www.manopt.org.
% Original author: Ahmed Douik, March 6, 2018.
% Contributors: Nicolas Boumal
% Change log:
%
%    Apr. 24, 2018 (AD):
%        Changed pinv() to a particular solution to the equation.
%
%    July 24, 2018 (AD):
%        A bugfix related to the pinv() change, with effects in many places.
%
%    Sep. 6, 2018 (NB):
%        Removed M.exp() as it was not implemented.
%
%    Aug. 19, 2019 (AD, BM, NB):
%        Fixed typos in comments; replaced some linear solves with pcg
%        for efficiency when n is large. By default, pcg is used:
%        comments in the code indicate other possibilities and how they
%        differ. Added maxDSiters to doubly_stochastic argument.
%        The main change has been to factor out these linear solves.

    e = ones(n, 1);

    maxDSiters = 100 + 2*n;

    function [alpha, beta] = mylinearsolve(X, b)
        % zeta = sparse(A)\b; % sparse might not be better perf.-wise.
        % where A = [eye(n) X ; X' eye(n)];
        %
        % For large n, use the pcg solver instead of \.
        % [zeta, ~, ~, iter] = pcg(sparse(A), b, 1e-6, 100);
        %
        % Even faster is to create a function handle
        % computing A*x (x is a given vector).
        % Make sure that A is not created, and X is only
        % passed with mylinearsolve and not A.
        [zeta, ~, ~, iter] = pcg(@mycompute, b, 1e-6, 100);

        function Ax = mycompute(x)
            xtop = x(1:n, 1);
            xbottom = x(n+1:end, 1);
            Axtop = xtop + X*xbottom;
            Axbottom = X'*xtop + xbottom;
            Ax = [Axtop; Axbottom];
        end

        alpha = zeta(1:n, 1);
        beta = zeta(n+1:end, 1);
    end

    M.name = @() sprintf('%dx%d doubly-stochastic matrices with positive entries', n, n);

    M.dim = @() (n-1)^2;

    % Fisher metric
    M.inner = @iproduct;
    function ip = iproduct(X, eta, zeta)
        ip = sum((eta(:).*zeta(:))./X(:));
    end

    M.norm = @(X, eta) sqrt(M.inner(X, eta, eta));

    M.dist = @(X, Y) error('multinomialdoublystochasticfactory.dist not implemented yet.');

    % The manifold is not compact as a result of the choice of the metric,
    % thus any choice here is arbitrary. This is notably used to pick
    % default values of initial and maximal trust-region radius in the
    % trustregions solver.
    M.typicaldist = @() n;

    % Pick a random point on the manifold.
    M.rand = @random;
    function X = random()
        Z = abs(randn(n, n));                 % Random point in the ambient space
        X = doubly_stochastic(Z, maxDSiters); % Projection onto the manifold
    end

    % Pick a random vector in the tangent space at X.
    M.randvec = @randomvec;
    function eta = randomvec(X)
        % A random vector in the ambient space
        Z = randn(n, n);
        % Projection of the vector onto the tangent space
        b = [sum(Z, 2) ; sum(Z, 1)'];
        [alpha, beta] = mylinearsolve(X, b);
        eta = Z - (alpha*e' + e*beta').*X;
        % Normalize the vector.
        nrm = M.norm(X, eta);
        eta = eta / nrm;
    end

    % Projection of vector eta in the ambient space to the tangent space.
    M.proj = @projection;
    function etaproj = projection(X, eta)
        % Projection of the vector eta in the ambient space onto the
        % tangent space.
        b = [sum(eta, 2) ; sum(eta, 1)'];
        [alpha, beta] = mylinearsolve(X, b);
        etaproj = eta - (alpha*e' + e*beta').*X;
    end

    M.tangent = M.proj;
    M.tangent2ambient = @(X, eta) eta;

    % Conversion of the Euclidean gradient to the Riemannian gradient.
    M.egrad2rgrad = @egrad2rgrad;
    function rgrad = egrad2rgrad(X, egrad)
        mu = X.*egrad;
        b = [sum(mu, 2) ; sum(mu, 1)'];
        [alpha, beta] = mylinearsolve(X, b);
        rgrad = mu - (alpha*e' + e*beta').*X;
    end

    % First-order retraction
    M.retr = @retraction;
    function Y = retraction(X, eta, t)
        if nargin < 3
            t = 1.0;
        end
        Y = X.*exp(t*(eta./X));
        Y = doubly_stochastic(Y, maxDSiters);
        Y = max(Y, eps);
    end

    % Conversion of the Euclidean Hessian to the Riemannian Hessian.
    M.ehess2rhess = @ehess2rhess;
    function rhess = ehess2rhess(X, egrad, ehess, eta)

        % Compute the directional derivative of the Riemannian gradient.
        gamma = egrad.*X;
        gammadot = ehess.*X + egrad.*eta;

        b = [sum(gamma, 2) ; sum(gamma, 1)'];
        bdot = [sum(gammadot, 2) ; sum(gammadot, 1)'];
        [alpha, beta] = mylinearsolve(X, b);
        [alphadot, betadot] = mylinearsolve(X, bdot - [eta*beta ; eta'*alpha]);

        S = alpha*e' + e*beta';
        deltadot = gammadot - (alphadot*e' + e*betadot').*X - S.*eta;

        % Project gamma.
        delta = gamma - S.*X;

        % Compute and project nabla.
        nabla = deltadot - 0.5*(delta.*eta)./X;
        rhess = projection(X, nabla);
    end


    % Miscellaneous manifold functions
    M.hash = @(X) ['z' hashmd5(X(:))];
    M.lincomb = @matrixlincomb;
    M.zerovec = @(X) zeros(n, n);
    M.transp = @(X1, X2, d) projection(X2, d);
    M.vec = @(X, U) U(:);
    M.mat = @(X, u) reshape(u, n, n);
    M.vecmatareisometries = @() false;

end
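
% Usage sketch (illustrative only, kept as comments so this file stays a
% valid function file): the factory plugs into Manopt's standard problem
% structure. The quadratic cost and target matrix A below are hypothetical
% choices for demonstration; any smooth cost with a Euclidean gradient is
% handled the same way, with egrad2rgrad applied internally by the solver.
%
%     n = 5;
%     problem.M = multinomialdoublystochasticfactory(n);
%     A = abs(randn(n));                             % hypothetical target
%     problem.cost  = @(X) 0.5*norm(X - A, 'fro')^2;
%     problem.egrad = @(X) X - A;                    % Euclidean gradient
%     X = trustregions(problem);    % X has positive entries; rows and
%                                   % columns each sum to one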