The doubly stochastic multinomial manifold $\mathcal{DP}_n$ (the set of entry-wise positive matrices of size nxn with columns and rows summing to 1) is endowed with a Riemannian manifold structure by considering it as a Riemannian submanifold of the embedding Euclidean space $\mathbb{R}^{n\times n}$ endowed with the Fisher information as inner product $\langle\xi_\mathbf{X},\eta_\mathbf{X}\rangle_\mathbf{X} = \sum\limits_{i=1}^n \sum\limits_{j=1}^n \cfrac{(\xi_\mathbf{X})_{ij} (\eta_\mathbf{X})_{ij}}{\mathbf{X}_{ij}}$. The geometry of the doubly stochastic multinomial manifold is available in the research paper [DH18].

Factory call: `M = multinomialdoublystochasticfactory(n)`

.

There is also a symmetric version of this factory: see `multinomialsymmetricfactory`

.

Name | Formula | Numerical representation |

Set | $\mathcal{DP}_n = \left\{ \mathbf{X} \in \mathbb{R}^{n \times n} \big| \mathbf{X}_{ij} > 0,\ \mathbf{X}\mathbf{1}=\mathbf{1},\ \mathbf{X}^T\mathbf{1}=\mathbf{1} \right\}$ | $X$ is represented as a matrix `X` of size nxn whose columns
and rows sum to 1, i.e., ```
sum(X(:, i))= sum(X(i,:)) =
1
``` for `i = 1:n ` . |

Tangent space at $X$ | $\mathcal{T}_{\mathbf{X}}\mathcal{D}\mathcal{P}_{n} = \left\{\mathbf{Z} \in \mathbb{R}^{n \times n} \big| \mathbf{Z}\mathbf{1}=\mathbf{0},\ \mathbf{Z}^T\mathbf{1}=\mathbf{0} \right\}$ | A tangent vector $Z$ at $X$ is represented as a matrix `Z`
of size nxn
such that each column and row of `Z` sum to 0,
i.e., `sum(Z(:, i))= sum(Z(i,:)) = 0` for ```
i
= 1:n
``` . |

Ambient space | $\mathbb{R}^{n\times n}$ | Points and vectors in the ambient space are, naturally, represented as matrices of size nxn. |

The following table shows some of the nontrivial available
functions in the structure `M`

. The norm $\|\cdot\|$
refers to the norm in the ambient space, which is the Frobenius
norm. The tutorial page
gives more details about the functionality implemented by each
function.

Name | Field usage | Formula |

Dimension | `M.dim() ` |
$\operatorname{dim}\mathcal{M} = (n-1)^2$ |

Metric | `M.inner(X, U, V) ` |
$\langle U, V\rangle_X = \operatorname{trace}(U^T (V \oslash X)) = \sum\limits_{i,j=1}^n \cfrac{U_{ij} V_{ij}}{X_{ij}}$ |

Norm | `M.norm(X, U) ` |
$\|U\|_X = \sqrt{\langle U, U \rangle_X}$ |

Distance | `M.dist(X, Y) ` |
Not implemented |

Typical distance | `M.typicaldist() ` |
$n$ |

Tangent space projector | `M.proj(X, H) ` |
$P_X(H)$ represents the orthogonal projection, in the Fisher information inner product sens, of the ambient point H onto the tangent space $\mathcal{T}_{\mathbf{X}}\mathcal{D}\mathcal{P}_{n}$. |

Euclidean to Riemannian gradient | `M.egrad2rgrad(X, egrad)` |
$\operatorname{grad} f(X) = P_X(\nabla f(X) \odot X)$,
where `egrad` represents the Euclidean gradient
$\nabla f(X)$, which is a vector in the ambient space. |

Euclidean to Riemannian Hessian | `M.ehess2rhess(X, egrad, ehess, U)` |
$\operatorname{Hess} f(X)[U]$ represents the Riemannian
gradient, where `egrad` represents the Euclidean
gradient $\nabla f(X)$ and `ehess` represents
the Euclidean Hessian $\nabla^2 f(X)[U]$, both being vectors
in the ambient space. |

Retraction | `M.retr(X, U, t) ` |
$\operatorname{Retr}_X(tU)$ retracts the tangent vector U to the manifold, i.e., U is a doubly stochastic matrix. |

Random point | `M.rand() ` |
Returns a point uniformly at random from the set of doubly stochastic matrices. |

Random vector | `M.randvec(X) ` |
Returns a unit-norm tangent vector at $X$ with uniformly random direction, obtained as follows: generate $H$ with i.i.d. normal entries; return: $U = P_X(H) / \|P_X(H)\|$. |

Vector transport | `M.transp(X, Y, U) ` |
$\operatorname{Transp}_{Y\leftarrow X}(U) = P_Y(U)$, where $U$ is a tangent vector at $X$ that is transported to the tangent space at $Y$. |

Let $A\in\mathbb{R}^{n\times n}$ be a noisy version of a doubly stochastic matrix. We search for the doubly stochastic matrix that is closest to $A$ according to the Frobenius norm. We minimize the following cost function:

$$f(X) = \frac{1}{2} \|X-A\|^2,$$

such that $X \in \mathcal{DP}_n$. Compute the Euclidean gradient and Hessian of $f$:

$$\nabla f(X) = X-A,$$

$$\nabla^2 f(X)[U] = U.$$

The Riemannian gradient and Hessian are computed automatically from these.

% Generate the problem data. n = 100 ; % Size of the matrix sigma = 1/n^2 ; % Noise standard deviation at each entry % Generate a doubly stochastic matrix using the Sinkhorn algorithm B = doubly_stochastic(abs(randn(n, n))) ; % Adding noise to the matrix A = max(B + sigma*randn(n, n),0.01) ; % Denoising cost function and derivatives cost = @(X) 0.5*norm(A-X, 'fro')^2; egrad = @(X) X - A; % Euclidean Gradient ehess = @(X, U) U; % Euclidean Hessian % Manifold initialization manifold = multinomialdoublystochasticfactory(n) ; problem.M = manifold; problem.cost = cost; problem.egrad = egrad; problem.ehess = ehess; % Numerically check the differentials. checkgradient(problem); pause; checkhessian(problem); pause; % Since the retraction is only first-order, the slope test is valid only at critical points.

X = trustregions(problem); % Solve the problem with a local optimization algorithm

For theory on Riemannian submanifolds, see [AMS08], section 3.6.1 (first-order derivatives) and section 5.3.3 (second-order derivatives, i.e., connections).

For content specifically about the doubly stochastic multinomial manifold with applications, see [DH18].

[AMS08] P.-A. Absil, R. Mahony and R. Sepulchre, Optimization Algorithms
on Matrix Manifolds, Princeton University Press, 2008.

[DH18] A. Douik and B. Hassibi,
Manifold Optimization Over the Set of Doubly Stochastic
Matrices: A Second-Order Geometry, 2018.