The symmetric stochastic multinomial manifold $\mathcal{SP}_n$ (the set of symmetric entry-wise positive matrices of size nxn with columns and rows summing to 1) is endowed with a Riemannian manifold structure by considering it as a Riemannian submanifold of the embedding Euclidean space $\mathbb{R}^{n\times n}$ endowed with the Fisher information as inner product $\langle\xi_\mathbf{X},\eta_\mathbf{X}\rangle_\mathbf{X} = \sum\limits_{i=1}^n \sum\limits_{j=1}^n \cfrac{(\xi_\mathbf{X})_{ij} (\eta_\mathbf{X})_{ij}}{\mathbf{X}_{ij}}$. The geometry of the symmetric stochastic multinomial manifold is available in the research paper [DH18].

Factory call: `M = multinomialsymmetricfactory(n)`

.

There is also a non-symmetric version of this factory: see `multinomialdoublystochasticfactory`

.

Name | Formula | Numerical representation |

Set | $\mathcal{SP}_n = \left\{ \mathbf{X} \in \mathbb{R}^{n \times n} \big| \mathbf{X}_{ij} > 0,\ \mathbf{X}\mathbf{1}=\mathbf{1},\ \mathbf{X}=\mathbf{X}^T \right\}$ | $X$ is represented as a symmetric matrix `X`
of size nxn
whose columns and rows sum to 1, i.e., ```
sum(X(:, i))=
1
``` for `i = 1:n ` and `X= X'` . |

Tangent space at $X$ | $\mathcal{T}_{\mathbf{X}}\mathcal{S}\mathcal{P}_{n} = \left\{\mathbf{Z} \in \mathbb{R}^{n \times n} \big| \mathbf{Z}\mathbf{1}=\mathbf{0},\ \mathbf{Z}=\mathbf{Z}^T \right\}$ | A tangent vector $Z$ at $X$ is represented as a symetric
matrix `Z` of size nxn
such that each column and row of `Z`
sum to 0, i.e., `sum(Z(:, i))= 0` for ```
i =
1:n
``` and `Z= Z'` . |

Ambient space | $\mathbb{R}^{n\times n}$ | Points and vectors in the ambient space are, naturally, represented as matrices of size nxn. |

The following table shows some of the nontrivial available
functions in the structure `M`

. The norm $\|\cdot\|$
refers to the norm in the ambient space, which is the Frobenius
norm. The tutorial page
gives more details about the functionality implemented by each
function.

Name | Field usage | Formula |

Dimension | `M.dim() ` |
$\operatorname{dim}\mathcal{M} = \frac{n(n-1)}{2}$ |

Metric | `M.inner(X, U, V) ` |
$\langle U, V\rangle_X = \operatorname{trace}(U^T (V \oslash X)) = \sum\limits_{i,j=1}^n \cfrac{U_{ij} V_{ij}}{X_{ij}}$ |

Norm | `M.norm(X, U) ` |
$\|U\|_X = \sqrt{\langle U, U \rangle_X}$ |

Distance | `M.dist(X, Y) ` |
Not implemented |

Typical distance | `M.typicaldist() ` |
$n$ |

Tangent space projector | `M.proj(X, H) ` |
$P_X(H)$ represents the orthogonal projection, in the Fisher information inner product sens, of the ambient point H onto the tangent space $\mathcal{T}_{\mathbf{X}}\mathcal{S}\mathcal{P}_{n}$. |

Euclidean to Riemannian gradient | `M.egrad2rgrad(X, egrad)` |
$\operatorname{grad} f(X) = P_X(\nabla f(X) \odot X)$,
where `egrad` represents the Euclidean gradient
$\nabla f(X)$, which is a vector in the ambient space. |

Euclidean to Riemannian Hessian | `M.ehess2rhess(X, egrad, ehess, U)` |
$\operatorname{Hess} f(X)[U]$ represents the Riemannian
gradient, where `egrad` represents the Euclidean
gradient $\nabla f(X)$ and `ehess` represents
the Euclidean Hessian $\nabla^2 f(X)[U]$, both being vectors
in the ambient space. |

Retraction | `M.retr(X, U, t) ` |
$\operatorname{Retr}_X(tU)$ retracts the tangent vector U to the manifold, i.e., U is a doubly stochastic matrix. |

Random point | `M.rand() ` |
Returns a point uniformly at random from the set of symmetric stochastic matrices. |

Random vector | `M.randvec(X) ` |
Returns a unit-norm tangent vector at $X$ with uniformly random direction, obtained as follows: generate $H$ with i.i.d. normal entries; return: $U = P_X(H) / \|P_X(H)\|$. |

Vector transport | `M.transp(X, Y, U) ` |
$\operatorname{Transp}_{Y\leftarrow X}(U) = P_Y(U)$, where $U$ is a tangent vector at $X$ that is transported to the tangent space at $Y$. |

Let $A\in\mathbb{R}^{n\times n}$ be a noisy version of a symmetric stochastic matrix. We search for the symmetric stochastic matrix that is closest to $A$ according to the Frobenius norm. We minimize the following cost function:

$$f(X) = \frac{1}{2} \|X-A\|^2,$$

such that $X \in \mathcal{SP}_n$. Compute the Euclidean gradient and Hessian of $f$ (important: this is assuming $A$ is symmetric!):

$$\nabla f(X) = X-A,$$

$$\nabla^2 f(X)[U] = U.$$

The Riemannian gradient and Hessian are computed automatically from these.

% Generate the problem data. n = 100; % Size of the matrix sigma = 1/n^2; % Noise standard deviation at each entry symm = @(X) .5*(X+X'); % Inline function to make a matrix symmetric % Generate a doubly stochastic matrix using the Sinkhorn algorithm A = symm(doubly_stochastic(abs(randn(n, n)))); % Adding noise to the matrix A = max(A + sigma*symm(randn(n, n)), 0.01); % Denoising function and derivatives cost = @(X) 0.5*norm( A-X, 'fro')^2; % Cost function egrad = @(X) X - A; % Euclidean Gradient ehess = @(X, U) U; % Euclidean Hessian % Manifold initialization manifold = multinomialsymmetricfactory(n); problem.M = manifold; problem.cost = cost; problem.egrad = egrad; problem.ehess = ehess; % Numerically check the differentials. checkgradient(problem); pause; checkhessian(problem); pause; % Since the retraction is only first-order, the slope test is valid only at critical points.

X = trustregions(problem); % Solve the problem with a local optimization algorithm

For theory on Riemannian submanifolds, see [AMS08], section 3.6.1 (first-order derivatives) and section 5.3.3 (second-order derivatives, i.e., connections).

For content specifically about the symmetric stochastic multinomial manifold with applications, see [DH18].

[AMS08] P.-A. Absil, R. Mahony and R. Sepulchre, Optimization Algorithms
on Matrix Manifolds, Princeton University Press, 2008.

[DH18] A. Douik and B. Hassibi,
Manifold Optimization Over the Set of Doubly Stochastic
Matrices: A Second-Order Geometry, 2018.