The sphere $\mathbb{S}^{nm-1}$ (the set of unit Frobenius norm matrices of size nxm) is endowed with a Riemannian manifold structure by considering it as a Riemannian submanifold of the embedding Euclidean space $\mathbb{R}^{n\times m}$ endowed with the usual inner product $\langle H_1, H_2 \rangle = \operatorname{trace}(H_1^T H_2)$.

Factory call: M = spherefactory(n, m). By default, m equals 1. If you need to work with several vectors of unit norm, have a look at the oblique manifold.

Name Formula Numerical representation
Set $\mathbb{S}^{nm-1} = \{ X \in \mathbb{R}^{n\times m} : \|X\| = 1 \}$ $X$ is represented as a matrix X of size nxm with unit Frobenius norm, i.e., X(:).'*X(:) = 1.
Tangent space at $X$ $T_X \mathbb{S}^{nm-1} = \{ U \in \mathbb{R}^{n\times m} : \operatorname{trace}(X^T U) = 0 \}$ A tangent vector $U$ at $X$ is represented as a matrix U of size nxm such that $\operatorname{trace}(X^T U) = 0$, i.e., X(:).'*U(:) = 0.
Ambient space $\mathbb{R}^{n\times m}$ Points and vectors in the ambient space are, naturally, represented as matrices of size nxm.

The following table shows some of the nontrivial available functions in the structure M. The norm $\|\cdot\|$ refers to the norm in the ambient space, which is the Frobenius norm. The tutorial page gives more details about the functionality implemented by each function.

Name Field usage Formula
Dimension M.dim() $\operatorname{dim}\mathcal{M} = nm-1$
Metric M.inner(X, U, V) $\langle U, V\rangle_X = \operatorname{trace}(U^T V)$
Norm M.norm(X, U) $\|U\|_X = \sqrt{\langle U, U \rangle_X}$
Distance M.dist(X, Y) $\operatorname{dist}(X, Y) = \arccos(\operatorname{trace}(X^T Y))$
Typical distance M.typicaldist() $\pi$
Tangent space projector M.proj(X, H) $P_X(H) = H - \operatorname{trace}(X^T H) X$, where H represents a vector in the ambient space.
Euclidean to Riemannian gradient M.egrad2rgrad(X, egrad) $\operatorname{grad} f(X) = P_X(\nabla f(X))$, where egrad represents the Euclidean gradient $\nabla f(X)$, which is a vector in the ambient space.
Euclidean to Riemannian Hessian M.ehess2rhess(X, egrad, ehess, U) $\operatorname{Hess} f(X)[U] = P_X(\nabla^2 f(X)[U]) - \operatorname{trace}(X^T \nabla f(X)) U$, where egrad represents the Euclidean gradient $\nabla f(X)$ and ehess represents the Euclidean Hessian $\nabla^2 f(X)[U]$, both being vectors in the ambient space.
Exponential map M.exp(X, U, t) $\operatorname{Exp}_X(tU) = \cos(\|tU\|) X + \frac{\sin(\|tU\|)}{\|U\|} U$
Retraction M.retr(X, U, t) $\operatorname{Retr}_X(tU) = \frac{X+tU}{\|X+tU\|}$ (Frobenius norm in the ambient space)
Logarithmic map M.log(X, Y) $\operatorname{Log}_X(Y) = \frac{\operatorname{dist}(X, Y)}{\|P_X(Y-X)\|} P_X(Y-X)$
Random point M.rand() Returns a point uniformly at random w.r.t. the natural measure as follows: generate $Y$ with i.i.d. normal entries; return $Y/\|Y\|$.
Random vector M.randvec(X) Returns a unit-norm tangent vector at $X$ with uniformly random direction, obtained as follows: generate $H$ with i.i.d. normal entries; return $P_X(H)/\|P_X(H)\|$.
Vector transport M.transp(X, Y, U) $\operatorname{Transp}_{Y\leftarrow X}(U) = P_Y(U)$, where $U$ is a tangent vector at $X$ that is transported to the tangent space at $Y$.
Pair mean M.pairmean(X, Y) $\operatorname{mean}(X, Y) = \frac{X+Y}{\|X+Y\|}$

This example continues the example from the tutorial page. Let $A\in\mathbb{R}^{n\times n}$ be a symmetric matrix. We minimize the following cost function:

$$f(x) = -x^T Ax,$$

such that $x^T x = 1$, that is, such that $x$ belongs to the sphere $\mathbb{S}^{n-1}$. Compute the Euclidean gradient and Hessian of $f$:

$$\nabla f(x) = -2Ax,$$

$$\nabla^2 f(x)[u] = -2Au.$$

The Riemannian gradient and Hessian are obtained by applying the M.egrad2rgrad and M.ehess2rhess operators:

$$\operatorname{grad} f(x) = (I-xx^T) \nabla f(x),$$

$$\operatorname{Hess} f(x)[u] = (I-xx^T)\nabla^2 f(x)[u] - (x^T \nabla f(x)) u.$$

Notice that there is no need to compute these explicitly: it suffices to write code for the Euclidean quantities and to apply the conversion tools on them to obtain the Riemannian quantities, as in the following code:

% Generate the problem data.
n = 1000;
A = randn(n);
A = .5*(A+A');

% Create the problem structure.
manifold = spherefactory(n);
problem.M = manifold;

% Define the problem cost function and its derivatives.
problem.cost = @(x) -x'*(A*x);
egrad = @(x) -2*A*x;
ehess = @(x, u) -2*A*u; problem.grad = @(x) manifold.egrad2rgrad(x, egrad(x));
problem.hess = @(x, u) manifold.ehess2rhess(x, egrad(x), ehess(x, u), u);

% Numerically check the differentials.
checkgradient(problem); pause;
checkhessian(problem); pause;

% Solve
x = trustregions(problem);

Of course, this is not efficient since the computation of $Ax$ should be reused when computing $f(x)$ and $\nabla f(x)$, just like$\nabla f(x)$ should be reused to compute $\nabla^2 f(x)[u]$. See the various ways of describing the cost function for better alternatives.

For theory on Riemannian submanifolds, see [AMS08], section 3.6.1 (first-order derivatives) and section 5.3.3 (second-order derivatives, i.e., connections).

For content specifically about the sphere, see [AMS08] and look up examples 3.5.1 (tangent space), 3.6.1 (metric, projector), 4.1.1 (retraction), 5.3.1 (connection), 5.4.1 (exponential map, i.e., geodesics) and 8.1.1 (parallel translation, not implemented), 8.1.4 and 8.1.7 (vector transport).

[AMS08] P.-A. Absil, R. Mahony and R. Sepulchre, Optimization Algorithms on Matrix Manifolds, Princeton University Press, 2008.

The first example on the tutorial page is an example on the sphere. Section 6.4.1 in [AMS08] also gives an example of second-order optimization on the sphere.

We are still working on building a collection of examples for Manopt. The relevant ones will be referenced here when the time comes.