function M = multinomialsymmetricfactory(n)
% Manifold of n-by-n symmetric stochastic matrices with positive entries.
%
% function M = multinomialsymmetricfactory(n)
%
% M is a Manopt manifold structure to optimize over the set of n-by-n
% symmetric matrices with (strictly) positive entries and such that the
% entries of each column and each row sum to one.
%
% Points on the manifold and tangent vectors are represented naturally as
% symmetric matrices of size n. The Riemannian metric imposed on the
% manifold is the Fisher metric, that is, if X is a point on the manifold
% and U, V are two tangent vectors:
%
%     M.inner(X, U, V) = <U, V>_X = sum(sum(U.*V./X)).
%
% The retraction provided here is only first order. Consequently, the
% slope test in the checkhessian tool is only valid at points X where the
% gradient is zero. Furthermore, if some entries of X are very close to
% zero, this may cause numerical difficulties that can also lead to a
% failed slope test. More generally, it is important that the solution of
% the optimization problem has positive entries, sufficiently far away
% from zero to avoid numerical issues.
%
% Link to the paper: https://arxiv.org/abs/1802.02628.
%
% Please cite the Manopt paper as well as the research paper:
%     @Techreport{Douik2018Manifold,
%       Title   = {Manifold Optimization Over the Set of Doubly Stochastic
%                  Matrices: {A} Second-Order Geometry},
%       Author  = {Douik, A. and Hassibi, B.},
%       Journal = {Arxiv preprint ArXiv:1802.02628},
%       Year    = {2018}
%     }
%
% See also: multinomialdoublystochasticfactory multinomialfactory

% This file is part of Manopt: www.manopt.org.
% Original author: Ahmed Douik, March 06, 2018.
% Contributors:
% Change log:
%
%    Sep. 6, 2018 (NB):
%        Removed M.exp() as it was not implemented.
%
%    Jan. 4, 2021 (NB):
%        Compatibility with Octave 6.1.0, at the cost of having duplicated
%        the definition of maxDSiters and of having replaced all
%        occurrences of 'e' with its formerly defined value, namely,
%        ones(n, 1).

    M.name = @() sprintf('%dx%d symmetric doubly-stochastic matrices with positive entries', n, n);

    M.dim = @() n*(n-1)/2;

    % Fisher metric
    M.inner = @iproduct;
    function ip = iproduct(X, eta, zeta)
        ip = sum((eta(:).*zeta(:))./X(:));
    end

    M.norm = @(X, eta) sqrt(M.inner(X, eta, eta));

    M.dist = @(X, Y) error('multinomialsymmetricfactory.dist not implemented yet.');

    % The manifold is not compact as a result of the choice of the metric,
    % thus any choice here is arbitrary. This is notably used to pick
    % default values of initial and maximal trust-region radius in the
    % trustregions solver.
    M.typicaldist = @() n;

    % Pick a random point on the manifold
    M.rand = @random;
    function X = random()
        maxDSiters = 100 + 2*n;
        Z = symm(abs(randn(n, n)));                 % Random point in the ambient space
        X = symm(doubly_stochastic(Z, maxDSiters)); % Projection on the manifold
    end

    % Pick a random vector in the tangent space at X, of norm 1
    M.randvec = @randomvec;
    function eta = randomvec(X)
        % A random vector in the ambient space
        Z = symm(randn(n, n));
        % Projection to the tangent space
        alpha = sum((eye(n) + X)\(Z), 2);
        eta = Z - (alpha*ones(n, 1)' + ones(n, 1)*alpha').*X;
        % Normalizing the vector
        nrm = M.norm(X, eta);
        eta = eta / nrm;
    end

    % Orthogonal projection of the vector eta in the ambient space to the
    % tangent space.
    M.proj = @projection;
    function etaproj = projection(X, eta)
        alpha = sum((eye(n) + X)\(eta), 2);
        etaproj = eta - (alpha*ones(n, 1)' + ones(n, 1)*alpha').*X;
    end

    M.tangent = M.proj;
    M.tangent2ambient = @(X, eta) eta;

    % Conversion of Euclidean to Riemannian gradient
    M.egrad2rgrad = @egrad2rgrad;
    function rgrad = egrad2rgrad(X, egrad)
        mu = sum((X.*egrad), 2);
        alpha = (eye(n) + X)\mu;
        rgrad = (egrad - alpha*ones(n, 1)' - ones(n, 1)*alpha').*X;
    end

    % First-order retraction
    M.retr = @retraction;
    function Y = retraction(X, eta, t)
        if nargin < 3
            t = 1.0;
        end
        maxDSiters = 100 + 2*n;
        Y = X.*exp(t*(eta./X));
        Y = symm(doubly_stochastic(Y, maxDSiters));
        Y = max(Y, eps);
    end

    % Conversion of Euclidean to Riemannian Hessian
    M.ehess2rhess = @ehess2rhess;
    function rhess = ehess2rhess(X, egrad, ehess, eta)

        % Computing the directional derivative of the Riemannian gradient
        gamma = egrad.*X;
        gammadot = ehess.*X + egrad.*eta;
        alpha = sum((eye(n) + X)\(gamma), 2);
        m = (eye(n) + X)\eta;
        alphadot = sum((eye(n) + X)\(gammadot - m*gamma), 2);
        S = (alpha*ones(n, 1)' + ones(n, 1)*alpha');
        deltadot = gammadot - (alphadot*ones(n, 1)' + ones(n, 1)*alphadot').*X - S.*eta;

        % Projecting gamma
        delta = gamma - S.*X;

        % Computing and projecting nabla
        nabla = deltadot - 0.5*(delta.*eta)./X;
        w = sum((eye(n) + X)\(nabla), 2);
        rhess = nabla - (w*ones(n, 1)' + ones(n, 1)*w').*X;

    end


    % Miscellaneous manifold functions
    M.hash = @(X) ['z' hashmd5(X(:))];
    M.lincomb = @matrixlincomb;
    M.zerovec = @(X) zeros(n, n);
    M.vec = @(X, U) U(:);
    M.mat = @(X, u) reshape(u, n, n);
    M.vecmatareisometries = @() false;

    M.transp = @transp;
    function U = transp(X1, X2, d)
        U = projection(X2, d);
    end

end

function A = symm(Z)
    A = .5*(Z + Z');
end
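
% The tangent-space projection and retraction defined above can be checked
% numerically. Below is a hypothetical NumPy sketch (not part of Manopt;
% sinkhorn_symm is a simplified stand-in for Manopt's doubly_stochastic
% helper followed by symm): it verifies that projected vectors are tangent
% (symmetric with zero row sums) and that the retraction lands back on the
% manifold.

```python
import numpy as np

def fisher_inner(X, U, V):
    # Fisher metric <U, V>_X = sum(U * V / X), assuming X > 0 entrywise
    return np.sum(U * V / X)

def project(X, Z):
    # Orthogonal projection of a symmetric ambient Z onto the tangent
    # space at X: solve (I + X) alpha = Z 1, then subtract
    # (alpha 1' + 1 alpha') .* X, mirroring M.proj above
    n = X.shape[0]
    alpha = np.linalg.solve(np.eye(n) + X, Z.sum(axis=1))
    return Z - (alpha[:, None] + alpha[None, :]) * X

def sinkhorn_symm(Y, iters=200):
    # Simplified stand-in for doubly_stochastic + symm: alternate row and
    # column normalization, re-symmetrizing after each sweep
    for _ in range(iters):
        Y = Y / Y.sum(axis=1, keepdims=True)
        Y = Y / Y.sum(axis=0, keepdims=True)
        Y = 0.5 * (Y + Y.T)
    return Y

def retract(X, eta, t=1.0):
    # First-order retraction: multiplicative step along eta, then
    # projection back onto the symmetric doubly stochastic set
    Y = X * np.exp(t * eta / X)
    return np.maximum(sinkhorn_symm(Y), np.finfo(float).eps)

n = 4
# An exactly symmetric doubly stochastic point with positive entries
a = 0.6
X = (1 - a) * np.eye(n) + a / (n - 1) * (np.ones((n, n)) - np.eye(n))

rng = np.random.default_rng(0)
Z = rng.standard_normal((n, n))
Z = 0.5 * (Z + Z.T)          # symmetric vector in the ambient space

eta = project(X, Z)          # tangent vector at X
Y = retract(X, 0.1 * eta)    # new point on the manifold
```

% Manopt's actual doubly_stochastic helper uses a more careful fixed-point
% iteration with a stopping criterion; the sketch above only illustrates
% the geometry of the projection and retraction.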