The sphere $\mathbb{S}^{nm-1}$ (the set of unit Frobenius norm matrices of size nxm) is endowed with a Riemannian manifold structure by considering it as a Riemannian submanifold of the embedding Euclidean space $\mathbb{R}^{n\times m}$ endowed with the usual inner product $\langle H_1, H_2 \rangle = \operatorname{trace}(H_1^T H_2)$.

Factory call: `M = spherefactory(n, m)`

. By default, `m`

equals 1. If you need to work with several vectors of unit norm, have a look at the oblique manifold.

Name | Formula | Numerical representation |

Set | $\mathbb{S}^{nm-1} = \{ X \in \mathbb{R}^{n\times m} : \|X\| = 1 \}$ | $X$ is represented as a matrix `X` of size nxm with unit Frobenius norm, i.e., `X(:).'*X(:) = 1` . |

Tangent space at $X$ | $T_X \mathbb{S}^{nm-1} = \{ U \in \mathbb{R}^{n\times m} : \operatorname{trace}(X^T U) = 0 \}$ | A tangent vector $U$ at $X$ is represented as a matrix `U` of size nxm such that $\operatorname{trace}(X^T U) = 0$, i.e., `X(:).'*U(:) = 0` . |

Ambient space | $\mathbb{R}^{n\times m}$ | Points and vectors in the ambient space are, naturally, represented as matrices of size nxm. |

The following table shows some of the nontrivial available functions in the structure `M`

. The norm $\|\cdot\|$ refers to the norm in the ambient space, which is the Frobenius norm. The tutorial page gives more details about the functionality implemented by each function.

Name | Field usage | Formula |

Dimension | `M.dim() ` |
$\operatorname{dim}\mathcal{M} = nm-1$ |

Metric | `M.inner(X, U, V) ` |
$\langle U, V\rangle_X = \operatorname{trace}(U^T V)$ |

Norm | `M.norm(X, U) ` |
$\|U\|_X = \sqrt{\langle U, U \rangle_X}$ |

Distance | `M.dist(X, Y) ` |
$\operatorname{dist}(X, Y) = \arccos(\operatorname{trace}(X^T Y))$ |

Typical distance | `M.typicaldist() ` |
$\pi$ |

Tangent space projector | `M.proj(X, H) ` |
$P_X(H) = H - \operatorname{trace}(X^T H) X$, where `H` represents a vector in the ambient space. |

Euclidean to Riemannian gradient | `M.egrad2rgrad(X, egrad)` |
$\operatorname{grad} f(X) = P_X(\nabla f(X))$, where `egrad` represents the Euclidean gradient $\nabla f(X)$, which is a vector in the ambient space. |

Euclidean to Riemannian Hessian | `M.ehess2rhess(X, egrad, ehess, U)` |
$\operatorname{Hess} f(X)[U] = P_X(\nabla^2 f(X)[U]) - \operatorname{trace}(X^T \nabla f(X)) U$, where `egrad` represents the Euclidean gradient $\nabla f(X)$ and `ehess` represents the Euclidean Hessian $\nabla^2 f(X)[U]$, both being vectors in the ambient space. |

Exponential map | `M.exp(X, U, t) ` |
$\operatorname{Exp}_X(tU) = \cos(\|tU\|) X + \frac{\sin(\|tU\|)}{\|U\|} U$ |

Retraction | `M.retr(X, U, t) ` |
$\operatorname{Retr}_X(tU) = \frac{X+tU}{\|X+tU\|}$ (Frobenius norm in the ambient space) |

Logarithmic map | `M.log(X, Y) ` |
$\operatorname{Log}_X(Y) = \frac{\operatorname{dist}(X, Y)}{\|P_X(Y-X)\|} P_X(Y-X)$ |

Random point | `M.rand() ` |
Returns a point uniformly at random w.r.t. the natural measure as follows: generate $Y$ with i.i.d. normal entries; return $Y/\|Y\|$. |

Random vector | `M.randvec(X) ` |
Returns a unit-norm tangent vector at $X$ with uniformly random direction, obtained as follows: generate $H$ with i.i.d. normal entries; return $P_X(H)/\|P_X(H)\|$. |

Vector transport | `M.transp(X, Y, U) ` |
$\operatorname{Transp}_{Y\leftarrow X}(U) = P_Y(U)$, where $U$ is a tangent vector at $X$ that is transported to the tangent space at $Y$. |

Pair mean | `M.pairmean(X, Y) ` |
$\operatorname{mean}(X, Y) = \frac{X+Y}{\|X+Y\|}$ |

This example continues the example from the tutorial page. Let $A\in\mathbb{R}^{n\times n}$ be a symmetric matrix. We minimize the following cost function:

$$f(x) = -x^T Ax,$$

such that $x^T x = 1$, that is, such that $x$ belongs to the sphere $\mathbb{S}^{n-1}$. Compute the Euclidean gradient and Hessian of $f$:

$$\nabla f(x) = -2Ax,$$

$$\nabla^2 f(x)[u] = -2Au.$$

The Riemannian gradient and Hessian are obtained by applying the `M.egrad2rgrad`

and `M.ehess2rhess`

operators:

$$\operatorname{grad} f(x) = (I-xx^T) \nabla f(x),$$

$$\operatorname{Hess} f(x)[u] = (I-xx^T)\nabla^2 f(x)[u] - (x^T \nabla f(x)) u.$$

Notice that there is no need to compute these explicitly: it suffices to write code for the Euclidean quantities and to apply the conversion tools on them to obtain the Riemannian quantities, as in the following code:

% Generate the problem data. n = 1000; A = randn(n); A = .5*(A+A'); % Create the problem structure. manifold = spherefactory(n); problem.M = manifold; % Define the problem cost function and its derivatives.

problem.cost = @(x) -x'*(A*x);

egrad = @(x) -2*A*x;

ehess = @(x, u) -2*A*u; problem.grad = @(x) manifold.egrad2rgrad(x, egrad(x));

problem.hess = @(x, u) manifold.ehess2rhess(x, egrad(x), ehess(x, u), u);

% Numerically check the differentials.

checkgradient(problem); pause;

checkhessian(problem); pause;

% Solve

x = trustregions(problem);

Of course, this is not efficient since the computation of $Ax$ should be reused when computing $f(x)$ and $\nabla f(x)$, just like$\nabla f(x)$ should be reused to compute $\nabla^2 f(x)[u]$. See the various ways of describing the cost function for better alternatives.

For theory on Riemannian submanifolds, see [AMS08], section 3.6.1 (first-order derivatives) and section 5.3.3 (second-order derivatives, i.e., connections).

For content specifically about the sphere, see [AMS08] and look up examples 3.5.1 (tangent space), 3.6.1 (metric, projector), 4.1.1 (retraction), 5.3.1 (connection), 5.4.1 (exponential map, i.e., geodesics) and 8.1.1 (parallel translation, not implemented), 8.1.4 and 8.1.7 (vector transport).

[AMS08] P.-A. Absil, R. Mahony and R. Sepulchre, Optimization Algorithms on Matrix Manifolds, Princeton University Press, 2008.

The first example on the tutorial page is an example on the sphere. Section 6.4.1 in [AMS08] also gives an example of second-order optimization on the sphere.

We are still working on building a collection of examples for Manopt. The relevant ones will be referenced here when the time comes.