The special orthogonal group $\mathrm{SO}(n)$ (the set of rotations in $\mathbb{R}^n$) is endowed with a Riemannian manifold structure by considering it as a Riemannian submanifold of the embedding Euclidean space $\mathbb{R}^{n\times n}$ endowed with the usual inner product $\langle H_1, H_2 \rangle = \operatorname{trace}(H_1^T H_2)$. Proper rotation matrices are orthogonal with determinant +1, as opposed to 1, that is: reflections are not allowed. The present factory allows working with several rotation matrices simultaneously and efficiently using 3D arrays of size nxnxk in Matlab. The geometry of $\mathrm{SO}(n)^k$ is obtained from the geometry of $\mathrm{SO}(n)$ in the obvious way by elementwise extension.
Factory call: M = rotationsfactory(n, k)
. By
default, k
equals 1.
Name  Formula and numerical representation 
Set  $\mathrm{SO}(n)^k = \{ X = (X_1, \ldots, X_k) \in (\mathbb{R}^{n\times n})^k : X_i^T X_i = I_n \textrm{ and } \det(X_i) = +1, i = 1:k \}$ 

$X$ is represented as a 3D array X of size nxnxk whose slices
are proper rotation matrices, i.e., X(:, :, i).'*X(:,
:, i) = eye(n) and det(X(:, :, i)) = 1
for i = 1:k . 
Tangent space at $X$  $T_X \mathrm{SO}(n)^k = \{ U = (X_1S_1, \ldots, X_kS_k) \in (\mathbb{R}^{n\times n})^k : S_i + S_i^T = 0, i = 1:k \}$ 

A tangent vector $U$ at $X$ is represented as an array S
of size nxnxk such
that each slice of S is a skewsymmetric
matrix, i.e., S(:, :, i) + S(:, :, i).' = 0
for i = 1:k . Thus, S numerically
contains $(S_1, \ldots, S_k)$, not $U$. This is important to
note. 
Ambient space  $(\mathbb{R}^{n\times n})^k$ 

Points and vectors in the ambient space are, naturally, represented as 3D arrays of size nxnxk. 
M
that take tangent vectors as input or output
work with skewsymmetric matrices. This skewsymmetric part is
often written $\Omega$; here, we call it $S$ to have a direct
correspondence between the math notation $S$ and the Matlab matrix
S
. In practice, for v = problem.ehess(x, u)
for example, u
is a representation of a tangent
vector: it is skewsymmetric (or an array of skewsymmetric
matrices). Use w = M.tangent2ambient(x, u)
to
convert u
to the actual vector in the ambient space
(obtained simply as x*u
in the case $k = 1$.) The
output v
, on the other hand, should be returned
simply as a vector in the ambient space (there is no conversion to
be done here.)The following table shows some of the nontrivial available
functions in the structure M
. The norm $\\cdot\$
refers to the norm in the ambient space, which is the Frobenius
norm. The tutorial page
gives more details about the functionality implemented by each
function.
Name  Field usage  Formula 
Dimension  M.dim() 
$\operatorname{dim}\mathcal{M} = k\frac{n(n1)}{2}$ 
Metric  M.inner(X, S, T) 
$\langle U, V\rangle_X = \sum_{i=1}^k
\operatorname{trace}(S_i^T T_i)$, where S and
T are representations of the tangent vectors $U
= (X_1S_1, \ldots, X_kS_k)$ and $V = (X_1T_1, \ldots,
X_kT_k)$. 
Norm  M.norm(X, S) 
$\U\_X = \sqrt{\langle U, U \rangle_X}$ 
Distance  M.dist(X, Y) 
$\operatorname{dist}(X, Y) = \sqrt{ \sum_{i=1}^k
\\log(X_i^TY_i)\^2 }$, where $\log$ denotes the matrix
logarithm logm . 
Typical distance  M.typicaldist() 
$\pi\sqrt{nk}$ 
Tangent space projector  S = M.proj(X, H) 
$(P_X(H))_i = X_i S_i$ with $S_i =
\operatorname{skew}(X_i^T H_i)$, for $i = 1\ldots k$ where H
represents a vector $H = (H_1, \ldots, H_k)$ in the ambient
space with slices H(:, :, i) corresponding to
$H_i$'s and $\operatorname{skew}$ extracts the
skewsymmetric part of a matrix: $\operatorname{skew}(A) =
\frac{AA^T}{2}$. Notice that only $S_i$ is computed since
tangent vectors $XS$ are represented using their
skewsymmetric part $S$ only. As a result, contrary to
intuition, applying proj twice is not
equivalent to applying proj once: this is
merely a question of numerical representation. See also the
function below. 
Tangent space to ambient space  H = M.tangent2ambient(X, S) 
Returns an ambient vector $H$ representing the tangent
vector which is represented by S , with H(:,
:, i) = X(:, :, i)*S(:, :, i) , for $i = 1\ldots k$.
For a given tangent vector (represented by the skew
symmetric slices S(:, :, i) ), returns a
representation of that vector in the ambient space, using
the slices H(:, :, i) . This function is
necessary because the proj operator takes as
input an ambient vector and returns a tangent vector. To
apply the proj again to the result (which
should change nothing), it is necessary to first represent
the tangent vector obtained as an ambient vector. 
Euclidean to Riemannian gradient  M.egrad2rgrad(X, egrad) 
$\operatorname{grad} f(X) = P_X(\nabla f(X))$, where egrad
is the Euclidean gradient $\nabla f(X)$, which is a vector
in the ambient space. 
Euclidean to Riemannian Hessian  M.ehess2rhess(X, egrad, ehess, S) 
$(\operatorname{Hess} f(X)[U])_i = P_{X_i}\left(
(\nabla^2 f(X)[U])_i  U_i
\operatorname{sym}\left( X_i^T (\nabla f(X))_i \right)
\right)$, for $i = 1\ldots k$, where S
represents the tangent vector $U = (X_1S_1, \ldots,
X_kS_k)$, egrad is the Euclidean gradient
$\nabla f(X)$, ehess is the Euclidean Hessian
$\nabla^2 f(X)[U]$ (the two latter being vectors in the
ambient space) and $\operatorname{sym}$ extracts the
symmetric part of a matrix: $\operatorname{sym}(A) =
\frac{A+A^T}{2}$. 
Exponential map  M.exp(X, S, t) 
$(\operatorname{Exp}_X(tU))_i =
X_i\operatorname{exp}(tS_i)$ where S
represents the tangent vector $U = (X_1S_1, \ldots, X_kS_k)$
and $\operatorname{exp}$ denotes the matrix exponential expm . 
Retraction  M.retr(X, S, t) 
$\left( \operatorname{Retr}_X(tU) \right)_i = \operatorname{qfactor}(X_i+tU_i)$, where $\operatorname{qfactor}$ extracts the Qfactor of the QR decomposition with positive elements on the diagonal of R. This is guaranteed to always return a proper rotation matrix, i.e., with determinant +1. 
Logarithmic map  S = M.log(X, Y) 
$(\operatorname{Log}_X(Y))_i = X_iS_i$, with $S_i =
\operatorname{log}(X_i^T Y_i)$ where $\operatorname{log}$
denotes the matrix logarithm logm . 
Random point  M.rand() 
Returns a point uniformly at random w.r.t. the natural measure, following the algorithm described in [Mez07]. Essentially, computes the Qfactor of a QR decomposition of a matrix with i.i.d. normal entries and makes sure the determinant is +1. 
Random vector  M.randvec(X) 
Returns a unitnorm tangent vector at $X$ with uniformly random direction. 
Vector transport  M.transp(X, Y, S) 
$\left(\operatorname{Transp}_{Y\leftarrow X}(U) \right)_i
= Y_iS_i$, where S represents a tangent vector
$U = (X_1S_1, \ldots, X_kS_k)$ at $X$ that is transported to
the tangent space at $Y$. Since tangent vectors are
represented in the Lie algebra, there is nothing to do: S
is returned untouched. Notice that this transport is an
isometry between tangent spaces: it preserves inner
products. 
Pair mean  M.pairmean(X, Y) 
$\left( \operatorname{mean}(X, Y) \right)_i = X_i \operatorname{exp}\left( \frac{\operatorname{log}(X_i^T Y_i)}{2} \right)$ 
Let $A, B\in\mathbb{R}^{n\times m}$ be two matrices representing clouds of $m$ points in $\mathbb{R}^n$. We search for the rotation matrix $X\in\mathrm{SO}(n)$ such that rotating all points in $B$ by $X$ will bring the points of $B$ as close as possible to the points of $A$, according to the Frobenius norm. This problem has a known solution (it is the Procrustes problem). We treat it merely for the sake of example. We minimize the cost $\frac{1}{2} \XB  A\^2 = \frac{1}{2} \A\^2 + \frac{1}{2} \B\^2  \operatorname{Trace}(B^T X^T A)$, which is equivalent to minimizing the following cost:
$$f(X) = \operatorname{Trace}(X^T \, AB^T),$$
such that $X \in \mathrm{SO}(n)$. Notice that we really are just looking for the closest rotation matrix to the matrix $AB^T$. Compute the Euclidean gradient and Hessian of $f$:
$$\nabla f(X) = AB^T,$$
$$\nabla^2 f(X)[XS] = 0.$$
Internally, the Riemannian gradient and Hessian are obtained by
applying the M.egrad2rgrad
and M.ehess2rhess
operators to these Euclidean quantities. This happens under the
hood, as in the following code:
% Generate the problem data.
n = 3;
m = 10;
A = randn(n, m);
B = randn(n, m);
ABt = A*B.';
% Create the problem structure. manifold = rotationsfactory(n, 1); problem.M = manifold; % Define the problem cost function and its Euclidean derivatives.
problem.cost = @(X) X(:).'*ABt(:); problem.egrad = @(X) ABt;
% Notice that, even though the Euclidean Hessian vanishes, the Riemannian Hessian does not.
% Indeed, plugging the Euclidean gradient and Hessian explicitly in the ehess2rhess
% formula gives a non trivial operator.
% In all generality, the Euclidean Hessian would not be zero: it would be a function of S.
% You would then need to transform S (which is a representation of the tangent vector)
% into an actual ambient vector corresponding to this tangent vector. In this case,
% it is simply X*S. More generally in Manopt, this operation is done via a call to
% manifold.tangent2ambient(X, S).
problem.ehess = @(X, S) manifold.zerovec(X);
% Numerically check the differentials.
checkgradient(problem); pause;
checkhessian(problem); pause;
% Solve the problem
X = trustregions(problem);
% Compare with the known optimal solution
[U, S, V] = svd(ABt);
UVt = U*V.';
% The determinant of UVt is either 1 or 1, in theory
if abs(1  det(UVt)) < 1e10
Xopt = UVt;
elseif abs(1  det(UVt)) < 1e10
% UVt is in O(n) but not SO(n). This is easily corrected for:
J = diag([ones(n1,1);1]);
Xopt = U*J*V.';
else
error('Should never ever happen ...');
end
fprintf('This should be small: %g\n', norm(Xopt  X, 'fro'));
Notice that if you work with the cost $f(X) = \frac{1}{2} \XB  A\^2$ directly, your Euclidean gradient becomes $\nabla f(X) = (XBA)B^T = XBB^T  AB^T$ and there seems to be an extra term in there. But this term, $XBB^T$, is of the form "$X$ times something symmetric", so that the orthogonal projection to the tangent space at $X$ will get rid of it.
More interestingly, one might try to solve the orthogonal Procrustes problem with more than two matrices, as in [TB77]. This is linked to the algorithms proposed in [BSA13], with Manopt code.
For theory on Riemannian submanifolds, see [AMS08], section 3.6.1 (firstorder derivatives) and section 5.3.3 (secondorder derivatives, i.e., connections).
For content specifically about the Lie group of rotations, [Bum04] is one of many resources.
For an application to the problem of interpolation and regression of rotation matrices, see [BA11].
For algorithms that generate uniformly random rotations, see [Mez07].
[AMS08] P.A. Absil, R. Mahony and R. Sepulchre, Optimization Algorithms
on Matrix Manifolds, Princeton University Press, 2008.
[BA11] N. Boumal and P.A. Absil, A
discrete regression method on manifolds and its application to
data on SO(n), 18th IFAC World Congress, 2011.
[Bum04] D. Bump, Lie
groups, vol. 225 of Graduate Texts in Mathematics. Springer,
2004.
[Mez07] F. Mezzadri, How
to generate random matrices from the classical compact groups,
Notices of the AMS, 54(5), 592–604, 2007.
[TB77] J.M.F. Ten Berge, Orthogonal
Procrustes rotation for two or more matrices, Psychometrika,
42(2), 267–276, 1977.
[BSA13] N. Boumal, A. Singer and P.A. Absil, Robust
estimation of rotations from relative measurements by maximum
likelihood, CDC 2013.