
# multinomialdoublystochasticfactory

## PURPOSE

Manifold of n-by-n doubly-stochastic matrices with positive entries.

## SYNOPSIS

function M = multinomialdoublystochasticfactory(n)

## DESCRIPTION

```
Manifold of n-by-n doubly-stochastic matrices with positive entries.

function M = multinomialdoublystochasticfactory(n)

M is a Manopt manifold structure to optimize over the set of n-by-n
matrices with (strictly) positive entries and such that the entries of
each column and each row sum to one.

Points on the manifold and tangent vectors are represented naturally as
matrices of size n. The Riemannian metric imposed on the manifold is the
Fisher metric, that is, if X is a point on the manifold and U, V are two
tangent vectors:

M.inner(X, U, V) = <U, V>_X = sum(sum(U.*V./X)).

The retraction here provided is only first order. Consequently, the
slope test in the checkhessian tool is only valid at points X where the
gradient is zero. Furthermore, if some entries of X are very close to
zero, this may cause numerical difficulties that can also lead to a
failed slope test. More generally, it is important that the solution of
the optimization problem should have positive entries, sufficiently far
away from zero to avoid numerical issues.

The file is based on developments in the research paper
A. Douik and B. Hassibi, "Manifold Optimization Over the Set
of Doubly Stochastic Matrices: A Second-Order Geometry"
ArXiv:1802.02628, 2018.

Link to the paper: https://arxiv.org/abs/1802.02628.

Please cite the Manopt paper as well as the research paper:
@Techreport{Douik2018Manifold,
Title   = {Manifold Optimization Over the Set of Doubly Stochastic
Matrices: {A} Second-Order Geometry},
Author  = {Douik, A. and Hassibi, B.},
Journal = {Arxiv preprint ArXiv:1802.02628},
Year    = {2018}
}
```

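To make the metric concrete, here is a small NumPy sketch (Python, not part of the MATLAB toolbox) of the Fisher inner product `<U, V>_X = sum(sum(U.*V./X))` and the norm it induces, matching `M.inner` and `M.norm` above:

```python
import numpy as np

def fisher_inner(X, U, V):
    """Fisher metric <U, V>_X = sum(U * V / X) at a point X with positive entries."""
    return np.sum(U * V / X)

def fisher_norm(X, U):
    """Norm induced by the Fisher metric."""
    return np.sqrt(fisher_inner(X, U, U))

# At the uniform doubly stochastic matrix X = ones/n, the Fisher metric
# reduces to n times the Euclidean inner product.
n = 4
X = np.full((n, n), 1.0 / n)
U = np.random.randn(n, n)
V = np.random.randn(n, n)
assert np.isclose(fisher_inner(X, U, V), n * np.sum(U * V))
```

Note that the metric blows up as entries of `X` approach zero, which is why the documentation above insists on solutions with entries sufficiently far from zero.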
## CROSS-REFERENCE INFORMATION

This function calls:
• doubly_stochastic Project a matrix to the doubly stochastic matrices (Sinkhorn's algorithm)
• hashmd5 Computes the MD5 hash of input data.
• matrixlincomb Linear combination function for tangent vectors represented as matrices.
This function is called by:

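The `doubly_stochastic` helper listed above implements Sinkhorn's algorithm: alternately normalize the rows and columns of a matrix with positive entries until both sums are close to one. A minimal NumPy sketch (illustrative only; the toolbox version differs in its stopping rule and iteration cap):

```python
import numpy as np

def sinkhorn(Z, max_iters=1000, tol=1e-10):
    """Project a matrix with positive entries onto the set of doubly
    stochastic matrices by alternating row/column normalization."""
    X = Z.astype(float).copy()
    for _ in range(max_iters):
        X /= X.sum(axis=1, keepdims=True)  # make each row sum to 1
        X /= X.sum(axis=0, keepdims=True)  # make each column sum to 1
        # After the column step, row sums drift slightly; stop once they
        # are back within tolerance of 1.
        if np.allclose(X.sum(axis=1), 1.0, atol=tol):
            break
    return X

X = sinkhorn(np.abs(np.random.randn(5, 5)) + 1e-3)
assert np.allclose(X.sum(axis=0), 1.0) and np.allclose(X.sum(axis=1), 1.0)
```

This is the projection used both by `M.rand` (to generate random points) and inside the retraction.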
## SOURCE CODE

```
0001 function M = multinomialdoublystochasticfactory(n)
0002 % Manifold of n-by-n doubly-stochastic matrices with positive entries.
0003 %
0004 % function M = multinomialdoublystochasticfactory(n)
0005 %
0006 % M is a Manopt manifold structure to optimize over the set of n-by-n
0007 % matrices with (strictly) positive entries and such that the entries of
0008 % each column and each row sum to one.
0009 %
0010 % Points on the manifold and tangent vectors are represented naturally as
0011 % matrices of size n. The Riemannian metric imposed on the manifold is the
0012 % Fisher metric, that is, if X is a point on the manifold and U, V are two
0013 % tangent vectors:
0014 %
0015 %     M.inner(X, U, V) = <U, V>_X = sum(sum(U.*V./X)).
0016 %
0017 % The retraction here provided is only first order. Consequently, the
0018 % slope test in the checkhessian tool is only valid at points X where the
0019 % gradient is zero. Furthermore, if some entries of X are very close to
0020 % zero, this may cause numerical difficulties that can also lead to a
0021 % failed slope test. More generally, it is important that the solution of
0022 % the optimization problem should have positive entries, sufficiently far
0023 % away from zero to avoid numerical issues.
0024 %
0025 % The file is based on developments in the research paper
0026 % A. Douik and B. Hassibi, "Manifold Optimization Over the Set
0027 % of Doubly Stochastic Matrices: A Second-Order Geometry"
0028 % ArXiv:1802.02628, 2018.
0029 %
0030 % Link to the paper: https://arxiv.org/abs/1802.02628.
0031 %
0032 % Please cite the Manopt paper as well as the research paper:
0033 % @Techreport{Douik2018Manifold,
0034 %   Title   = {Manifold Optimization Over the Set of Doubly Stochastic
0035 %              Matrices: {A} Second-Order Geometry},
0036 %   Author  = {Douik, A. and Hassibi, B.},
0037 %   Journal = {Arxiv preprint ArXiv:1802.02628},
0038 %   Year    = {2018}
0039 % }
0040 %
0042
0043 % This file is part of Manopt: www.manopt.org.
0044 % Original author: Ahmed Douik, March 6, 2018.
0045 % Contributors: Nicolas Boumal
0046 % Change log:
0047 %
0048 %    Apr. 24, 2018 (AD):
0049 %        Changed pinv() to a particular solution to the equation.
0050 %
0051 %    July 24, 2018 (AD):
0052 %        A bugfix related to the pinv() change, with effects in many places.
0053 %
0054 %    Sep.  6, 2018 (NB):
0055 %        Removed M.exp() as it was not implemented.
0056 %
0057 %    Aug. 19, 2019 (AD, BM, NB):
0058 %        Fixed typos in comments; replaced some linear solves with pcg
0059 %        for efficiency when n is large. By default, pcg is used:
0060 %        comments in the code indicate other possibilities and how they
0061 %        differ. Added maxDSiters to doubly_stochastic argument.
0062 %        The main change has been to factor out these linear solves.
0063
0064     e = ones(n, 1);
0065
0066     maxDSiters = 100 + 2*n;
0067
0068     function [alpha, beta] = mylinearsolve(X, b)
0069         % zeta = sparse(A)\b; % sparse might not be better perf.-wise.
0070         % where A = [eye(n) X ; X' eye(n)];
0071         %
0072         % For large n use the pcg solver instead of \.
0073         % [zeta, ~, ~, iter] = pcg(sparse(A), b, 1e-6, 100);
0074         %
0075         % Even faster is to create a function handle
0076         % computing A*x (x is a given vector).
0077         % Make sure that A is not created, and X is only
0078         % passed with mylinearsolve and not A.
0079         [zeta, ~, ~, iter] = pcg(@mycompute, b, 1e-6, 100);
0080
0081         function Ax = mycompute(x)
0082             xtop = x(1:n,1);
0083             xbottom = x(n+1:end,1);
0084             Axtop = xtop + X*xbottom;
0085             Axbottom = X'*xtop + xbottom;
0086             Ax = [Axtop; Axbottom];
0087         end
0088
0089         alpha = zeta(1:n, 1);
0090         beta = zeta(n+1:end, 1);
0091     end
0092
0093     M.name = @() sprintf('%dx%d doubly-stochastic matrices with positive entries', n, n);
0094
0095     M.dim = @() (n-1)^2;
0096
0097     % Fisher metric
0098     M.inner = @iproduct;
0099     function ip = iproduct(X, eta, zeta)
0100         ip = sum((eta(:).*zeta(:))./X(:));
0101     end
0102
0103     M.norm = @(X, eta) sqrt(M.inner(X, eta, eta));
0104
0105     M.dist = @(X, Y) error('multinomialdoublystochasticfactory.dist not implemented yet.');
0106
0107     % The manifold is not compact as a result of the choice of the metric,
0108     % thus any choice here is arbitrary. This is notably used to pick
0109     % default values of initial and maximal trust-region radius in the
0110     % trustregions solver.
0111     M.typicaldist = @() n;
0112
0113     % Pick a random point on the manifold
0114     M.rand = @random;
0115     function X = random()
0116         Z = abs(randn(n, n));     % Random point in the ambient space
0117         X = doubly_stochastic(Z, maxDSiters); % Projection on the Manifold
0118     end
0119
0120     % Pick a random vector in the tangent space at X.
0121     M.randvec = @randomvec;
0122     function eta = randomvec(X) % A random vector in the tangent space
0123         % A random vector in the ambient space
0124         Z = randn(n, n);
0125         % Projection of the vector onto the tangent space
0126         b = [sum(Z, 2) ; sum(Z, 1)'];
0127         [alpha, beta] = mylinearsolve(X, b);
0128         eta = Z - (alpha*e' + e*beta').*X;
0129         % Normalizing the vector
0130         nrm = M.norm(X, eta);
0131         eta = eta / nrm;
0132     end
0133
0134     % Projection of vector eta in the ambient space to the tangent space.
0135     M.proj = @projection;
0136     function etaproj = projection(X, eta) % Projection of the vector eta in the ambient space onto the tangent space
0137         b = [sum(eta, 2) ; sum(eta, 1)'];
0138         [alpha, beta] = mylinearsolve(X, b);
0139         etaproj = eta - (alpha*e' + e*beta').*X;
0140     end
0141
0142     M.tangent = M.proj;
0143     M.tangent2ambient = @(X, eta) eta;
0144
0145     % Conversion of Euclidean to Riemannian gradient
0146     M.egrad2rgrad = @egrad2rgrad;
0147     function rgrad = egrad2rgrad(X, egrad)
0148         mu = (X.*egrad);
0149         b = [sum(mu, 2) ; sum(mu, 1)'];
0150         [alpha, beta] = mylinearsolve(X, b);
0151         rgrad = mu - (alpha*e' + e*beta').*X;
0152     end
0153
0154     % First-order retraction
0155     M.retr = @retraction;
0156     function Y = retraction(X, eta, t)
0157         if nargin < 3
0158             t = 1.0;
0159         end
0160         Y = X.*exp(t*(eta./X));
0161         Y = doubly_stochastic(Y, maxDSiters);
0162         Y = max(Y, eps);
0163     end
0164
0165     % Conversion of Euclidean to Riemannian Hessian
0166     M.ehess2rhess = @ehess2rhess;
0167     function rhess = ehess2rhess(X, egrad, ehess, eta)
0168
0169         % Computing the directional derivative of the Riemannian
0170         % gradient
0171         gamma = egrad.*X;
0172         gammadot = ehess.*X + egrad.*eta;
0173
0174         b = [sum(gamma, 2) ; sum(gamma, 1)'];
0175         bdot = [sum(gammadot, 2) ; sum(gammadot, 1)'];
0176         [alpha, beta] = mylinearsolve(X, b);
0177         [alphadot, betadot] = mylinearsolve(X, bdot- [eta*beta; eta'*alpha]);
0178
0179         S = (alpha*e' + e*beta');
0180         deltadot = gammadot - (alphadot*e' + e*betadot').*X - S.*eta;
0181
0182         % projecting gamma
0183         delta = gamma - S.*X;
0184
0185         % computing and projecting nabla
0186         nabla = deltadot - 0.5*(delta.*eta)./X;
0187         rhess = projection(X, nabla);
0188     end
0189
0190
0191     % Miscellaneous manifold functions
0192     M.hash = @(X) ['z' hashmd5(X(:))];
0193     M.lincomb = @matrixlincomb;
0194     M.zerovec = @(X) zeros(n, n);
0195     M.transp = @(X1, X2, d) projection(X2, d);
0196     M.vec = @(X, U) U(:);
0197     M.mat = @(X, u) reshape(u, n, n);
0198     M.vecmatareisometries = @() false;
0199
0200 end
```
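The tangent-space projection at the heart of `proj`, `egrad2rgrad`, `randvec`, and `transp` removes the component `(alpha*e' + e*beta').*X`, where `alpha` and `beta` solve the linear system with matrix `A = [I X; X' I]` handled by `mylinearsolve`. A NumPy sketch of the same computation (using a dense least-squares solve in place of `pcg`; `A` is singular, with `[e; -e]` in its null space, so any particular solution works):

```python
import numpy as np

def project_tangent(X, eta):
    """Project an ambient matrix eta onto the tangent space at X:
    eta - (alpha e' + e beta') .* X, where [alpha; beta] solves
    [I X; X' I] [alpha; beta] = [row sums of eta; column sums of eta]."""
    n = X.shape[0]
    I = np.eye(n)
    A = np.block([[I, X], [X.T, I]])             # singular: A @ [e; -e] = 0
    b = np.concatenate([eta.sum(axis=1), eta.sum(axis=0)])
    zeta = np.linalg.lstsq(A, b, rcond=None)[0]  # a particular solution suffices
    alpha, beta = zeta[:n], zeta[n:]
    return eta - (alpha[:, None] + beta[None, :]) * X

# Tangent vectors at X have zero row sums and zero column sums.
n = 4
X = np.full((n, n), 1.0 / n)   # uniform doubly stochastic point
eta = project_tangent(X, np.random.randn(n, n))
assert np.allclose(eta.sum(axis=0), 0) and np.allclose(eta.sum(axis=1), 0)
```

For large `n`, building `A` densely is wasteful; that is exactly why the MATLAB code passes a function handle computing `A*x` to `pcg` instead of forming `A`.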

Generated on Fri 30-Sep-2022 13:18:25 by m2html © 2005