Manopt, symmetric stochastic multinomial manifold

The symmetric stochastic multinomial manifold $\mathcal{SP}_n$ is the set of symmetric, entry-wise positive matrices of size $n\times n$ whose rows and columns sum to 1. It is endowed with a Riemannian manifold structure by considering it as a Riemannian submanifold of the embedding Euclidean space $\mathbb{R}^{n\times n}$ endowed with the Fisher information metric as inner product: $\langle\xi_\mathbf{X},\eta_\mathbf{X}\rangle_\mathbf{X} = \sum\limits_{i=1}^n \sum\limits_{j=1}^n \cfrac{(\xi_\mathbf{X})_{ij} (\eta_\mathbf{X})_{ij}}{\mathbf{X}_{ij}}$. The geometry of this manifold is described in the research paper [DH18].
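
As a quick sanity check of the formula above, the inner product can be evaluated by hand in plain MATLAB, without Manopt. The sketch below uses a hypothetical 2x2 point `X` and two hand-picked tangent vectors `xi` and `eta` (illustrative values, not taken from the toolbox) and compares the double-sum form with an equivalent trace form.

```matlab
% A 2x2 symmetric matrix with positive entries whose rows and columns sum to 1.
X = [0.7 0.3; 0.3 0.7];

% Two symmetric matrices whose rows and columns sum to 0 (tangent vectors at X).
xi  = [1 -1; -1 1];
eta = [2 -2; -2 2];

% Fisher information inner product: entry-wise products weighted by 1 ./ X.
ip_sum = sum(sum(xi .* eta ./ X));

% Equivalent trace form, trace(xi' * (eta ./ X)).
ip_trace = trace(xi' * (eta ./ X));

fprintf('double sum: %.6f, trace form: %.6f\n', ip_sum, ip_trace);
```

Entries of `X` close to zero inflate the inner product, which is why points of the manifold are required to be entry-wise positive.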

Factory call: M = multinomialsymmetricfactory(n).

There is also a non-symmetric version of this factory: see multinomialdoublystochasticfactory.

| Name | Formula | Numerical representation |
| --- | --- | --- |
| Set | $\mathcal{SP}_n = \left\{ \mathbf{X} \in \mathbb{R}^{n \times n} \ \big\vert\ \mathbf{X}_{ij} > 0,\ \mathbf{X}\mathbf{1}=\mathbf{1},\ \mathbf{X}=\mathbf{X}^T \right\}$ | $X$ is represented as a symmetric matrix `X` of size $n\times n$ whose columns and rows sum to 1, i.e., `sum(X(:, i)) = 1` for `i = 1:n` and `X = X'`. |
| Tangent space at $X$ | $\mathcal{T}_{\mathbf{X}}\mathcal{S}\mathcal{P}_{n} = \left\{\mathbf{Z} \in \mathbb{R}^{n \times n} \ \big\vert\ \mathbf{Z}\mathbf{1}=\mathbf{0},\ \mathbf{Z}=\mathbf{Z}^T \right\}$ | A tangent vector $Z$ at $X$ is represented as a symmetric matrix `Z` of size $n\times n$ whose columns and rows sum to 0, i.e., `sum(Z(:, i)) = 0` for `i = 1:n` and `Z = Z'`. |
| Ambient space | $\mathbb{R}^{n\times n}$ | Points and vectors in the ambient space are, naturally, represented as matrices of size $n\times n$. |
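
The representations in the table can be checked numerically. The following is a minimal sketch, assuming Manopt is installed and on the MATLAB path; it draws a point and a tangent vector with `M.rand` and `M.randvec` and measures how far they are from satisfying the constraints. The variable names and the size `n = 5` are illustrative choices.

```matlab
% Assumes Manopt is installed and on the MATLAB path.
n = 5;
M = multinomialsymmetricfactory(n);

X = M.rand();       % random point on the manifold
Z = M.randvec(X);   % random tangent vector at X

% Constraint violations for the point: positivity, unit sums, symmetry.
min_entry = min(X(:));
sum_err_X = max(abs(sum(X, 1) - 1));
sym_err_X = norm(X - X', 'fro');

% Constraint violations for the tangent vector: zero sums, symmetry.
sum_err_Z = max(abs(sum(Z, 1)));
sym_err_Z = norm(Z - Z', 'fro');

fprintf('X: min entry %g, sum error %g, symmetry error %g\n', ...
        min_entry, sum_err_X, sym_err_X);
fprintf('Z: sum error %g, symmetry error %g\n', sum_err_Z, sym_err_Z);
```

Because the matrices are symmetric, checking the column sums is enough: the row sums are the same numbers. The errors are expected to be small rather than exactly zero, since points are produced by an iterative normalization.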

The following table shows some of the nontrivial available functions in the structure M. The norm $\|\cdot\|$ refers to the norm in the ambient space, which is the Frobenius norm. The tutorial page gives more details about the functionality implemented by each function.

| Name | Field usage | Formula |
| --- | --- | --- |
| Dimension | `M.dim()` | $\operatorname{dim}\mathcal{M} = \frac{n(n-1)}{2}$ |
| Metric | `M.inner(X, U, V)` | $\langle U, V\rangle_X = \operatorname{trace}(U^T (V \oslash X)) = \sum\limits_{i,j=1}^n \cfrac{U_{ij} V_{ij}}{X_{ij}}$ |
| Norm | `M.norm(X, U)` | $\lVert U\rVert_X = \sqrt{\langle U, U \rangle_X}$ |
| Distance | `M.dist(X, Y)` | Not implemented |
| Typical distance | `M.typicaldist()` | $n$ |
| Tangent space projector | `M.proj(X, H)` | $P_X(H)$ is the orthogonal projection, in the sense of the Fisher information inner product, of the ambient point $H$ onto the tangent space $\mathcal{T}_{\mathbf{X}}\mathcal{S}\mathcal{P}_{n}$. |
| Euclidean to Riemannian gradient | `M.egrad2rgrad(X, egrad)` | $\operatorname{grad} f(X) = P_X(\nabla f(X) \odot X)$, where `egrad` represents the Euclidean gradient $\nabla f(X)$, which is a vector in the ambient space. |
| Euclidean to Riemannian Hessian | `M.ehess2rhess(X, egrad, ehess, U)` | $\operatorname{Hess} f(X)[U]$ is the Riemannian Hessian of $f$ at $X$ along the tangent vector $U$, where `egrad` represents the Euclidean gradient $\nabla f(X)$ and `ehess` represents the Euclidean Hessian $\nabla^2 f(X)[U]$, both being vectors in the ambient space. |
| Retraction | `M.retr(X, U, t)` | $\operatorname{Retr}_X(tU)$ retracts the tangent vector $tU$ to the manifold, i.e., the result is a symmetric doubly stochastic matrix. |
| Random point | `M.rand()` | Returns a point uniformly at random from the set of symmetric stochastic matrices. |
| Random vector | `M.randvec(X)` | Returns a unit-norm tangent vector at $X$ with uniformly random direction, obtained as follows: generate $H$ with i.i.d. normal entries; return $U = P_X(H) / \lVert P_X(H)\rVert$. |
| Vector transport | `M.transp(X, Y, U)` | $\operatorname{Transp}_{Y\leftarrow X}(U) = P_Y(U)$, where $U$ is a tangent vector at $X$ that is transported to the tangent space at $Y$. |
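
To illustrate how some of these fields fit together, here is a small sketch, assuming Manopt is installed and on the MATLAB path. It checks that `M.norm` agrees with `M.inner`, that the residual of `M.proj` is orthogonal to the tangent space in the Fisher metric, and that `M.retr` returns a matrix with unit row sums. The test matrix `H`, the step `0.1`, and the variable names are illustrative assumptions.

```matlab
% Assumes Manopt is installed and on the MATLAB path.
n = 4;
M = multinomialsymmetricfactory(n);

X = M.rand();
U = M.randvec(X);
V = M.randvec(X);

% M.norm should agree with the square root of M.inner.
norm_gap = abs(M.norm(X, U)^2 - M.inner(X, U, U));

% Project a symmetric ambient matrix H onto the tangent space at X.
H = randn(n);
H = 0.5*(H + H');
PH = M.proj(X, H);

% For an orthogonal projection, the residual H - PH is orthogonal to
% every tangent vector with respect to the Fisher metric.
orth_gap = abs(M.inner(X, H - PH, V));

% Retract a short step along U and measure the row-sum violation.
Y = M.retr(X, U, 0.1);
retr_sum_err = max(abs(sum(Y, 2) - 1));

fprintf('norm gap %g, orthogonality gap %g, retraction sum error %g\n', ...
        norm_gap, orth_gap, retr_sum_err);
```

The ambient matrix `H` is symmetrized before projection because tangent vectors are symmetric; whether `M.proj` also accepts non-symmetric inputs is not stated here, so the sketch avoids relying on it.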

Let $A\in\mathbb{R}^{n\times n}$ be a noisy version of a symmetric stochastic matrix. We search for the symmetric stochastic matrix that is closest to $A$ according to the Frobenius norm. We minimize the following cost function:

$$f(X) = \frac{1}{2} \|X-A\|^2,$$

such that $X \in \mathcal{SP}_n$. Compute the Euclidean gradient and Hessian of $f$ (important: this is assuming $A$ is symmetric!):

$$\nabla f(X) = X-A,$$

$$\nabla^2 f(X)[U] = U.$$

The Riemannian gradient and Hessian are computed automatically from these.
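
To make the conversion concrete, the sketch below compares `M.egrad2rgrad` with the formula $P_X(\nabla f(X) \odot X)$ from the table of functions, and checks that `M.ehess2rhess` returns a tangent vector. It assumes Manopt is installed and on the MATLAB path; the target `A` is a hypothetical stand-in drawn with `M.rand`, not the noisy matrix of the example.

```matlab
% Assumes Manopt is installed and on the MATLAB path.
n = 4;
M = multinomialsymmetricfactory(n);

X = M.rand();
A = M.rand();   % hypothetical symmetric target (stand-in for the noisy matrix)

% Euclidean gradient of f(X) = 0.5*||X - A||^2.
egrad = X - A;

% Riemannian gradient: documented formula versus the factory's conversion.
rgrad_formula = M.proj(X, egrad .* X);
rgrad_factory = M.egrad2rgrad(X, egrad);
grad_gap = norm(rgrad_formula - rgrad_factory, 'fro');

% Riemannian Hessian along a tangent direction U; the Euclidean Hessian is U.
U = M.randvec(X);
rhess = M.ehess2rhess(X, egrad, U, U);
hess_sum_err = max(abs(sum(rhess, 2)));   % rows of a tangent vector sum to 0

fprintf('gradient gap %g, Hessian row-sum error %g\n', grad_gap, hess_sum_err);
```

In normal use these conversions are not called by hand: supplying `problem.egrad` and `problem.ehess` is enough for the solvers to apply them.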

% Generate the problem data.

n = 100; % Size of the matrix
sigma = 1/n^2; % Noise standard deviation at each entry

symm = @(X) .5*(X+X'); % Inline function to make a matrix symmetric

% Generate a doubly stochastic matrix using the Sinkhorn algorithm
A = symm(doubly_stochastic(abs(randn(n, n))));
% Adding noise to the matrix
A = max(A + sigma*symm(randn(n, n)), 0.01);

% Denoising function and derivatives
cost = @(X) 0.5*norm(X - A, 'fro')^2; % Cost function
egrad = @(X) X - A;  % Euclidean Gradient
ehess = @(X, U) U; % Euclidean Hessian

% Manifold initialization
manifold = multinomialsymmetricfactory(n);
problem.M = manifold;
problem.cost = cost;
problem.egrad = egrad;
problem.ehess = ehess;

% Numerically check the differentials.
checkgradient(problem); pause;
checkhessian(problem); pause; % Since the retraction is only first-order, the slope test is valid only at critical points.

X = trustregions(problem); % Solve the problem with a local optimization algorithm
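
After the solver returns, it can be useful to confirm that the result is feasible and that the cost decreased. The following is a self-contained sketch on a smaller, hypothetical instance (`n = 10`, an explicit initial point `X0`, and `options.verbosity = 0` to silence the solver output), assuming Manopt is installed and on the MATLAB path.

```matlab
% Assumes Manopt is installed and on the MATLAB path.
n = 10;
M = multinomialsymmetricfactory(n);

% Hypothetical small instance: a random point plus symmetric noise.
N = randn(n);
A = max(M.rand() + (1/n^2) * 0.5*(N + N'), 0.01);

problem = struct();
problem.M = M;
problem.cost  = @(X) 0.5*norm(X - A, 'fro')^2;
problem.egrad = @(X) X - A;
problem.ehess = @(X, U) U;

% Solve from an explicit initial point, without printing iterations.
X0 = M.rand();
options = struct();
options.verbosity = 0;
[X, final_cost] = trustregions(problem, X0, options);

% Feasibility and progress diagnostics.
initial_cost = problem.cost(X0);
row_sum_err = max(abs(sum(X, 2) - 1));
sym_err = norm(X - X', 'fro');
min_entry = min(X(:));

fprintf('cost: %g -> %g\n', initial_cost, final_cost);
fprintf('row-sum error %g, symmetry error %g, min entry %g\n', ...
        row_sum_err, sym_err, min_entry);
```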

For theory on Riemannian submanifolds, see [AMS08], section 3.6.1 (first-order derivatives) and section 5.3.3 (second-order derivatives, i.e., connections).

For content specifically about the symmetric stochastic multinomial manifold with applications, see [DH18].

[AMS08] P.-A. Absil, R. Mahony and R. Sepulchre, Optimization Algorithms on Matrix Manifolds, Princeton University Press, 2008.
[DH18] A. Douik and B. Hassibi, Manifold Optimization Over the Set of Doubly Stochastic Matrices: A Second-Order Geometry, 2018.
