We denote the set of rank-$k$ matrices of size $m\times n$ as $\mathbb{R}_k ^{m \times n}$. We parameterize it as $X = LR^T$ where $L \in \mathbb{R}^{m\times k}_k$ and $R \in \mathbb{R}^{n\times k}_k$. Notice that this factorization is not unique. Indeed, for any invertible matrix $M$ of size $k$, $(L, R)$ and $(LM^{-1}, RM^T)$ are mapped to the same $X$. This observation leads to the identification of $\mathbb{R}_k ^{m \times n}$ with the quotient space $(\mathbb{R}_k ^{m \times k} \times \mathbb{R}_k ^{n \times k}) / \mathbb{R}_k ^{k \times k}$. Points on this quotient space are equivalence classes $[(L, R)] = \{(LM^{-1}, RM^T) : M \in \mathbb{R}_k^{k\times k}\}$. Conceptually, we move on the quotient space but matrix representations are done in the total space, $\mathbb{R}_k ^{m \times k} \times \mathbb{R}_k ^{n \times k} $.

A function $\phi :\mathbb{R}_k ^{m \times n} \rightarrow \mathbb{R}: X \mapsto \phi(X) $ induces a function $f: \mathbb{R}_k ^{m \times k} \times \mathbb{R}_k ^{n \times k} \rightarrow \mathbb{R}: (L, R) \mapsto f(L, R)$ where $f(L, R) = \phi(LR^T)$ (though in some developments belows we use the notation $f(X)$ for simplicity). The total space $\mathbb{R}_k ^{m \times k} \times \mathbb{R}_k ^{n \times k} $ is endowed with **a scaled metric that is different from the standard Euclidean metric**. This choice leads, in turn, to a Riemannian quotient geometry for $\mathbb{R}_k^{m \times n}$.

The following material is referenced from the papers [AAM12] and [MMBS12] (see below).

Factory call: `M = fixedrankfactory_2factors(m,n,k)`

.

Name | Formula | Numerical representation |

Set | $\mathbb{R}_k ^{m \times n} = \{ LR^T: L \in \mathbb{R}^{m \times k}_k, R \in \mathbb{R}^{n\times k}_k \} $ | $X = (L, R)$ is represented as a structure `X` with two fields. $L$ and $R$ are represented as matrices `X.L` and `X.R` of size $m\times k$ and $n \times k$ respectively. |

Horizontal space at $X$ | $\{ (U_L, U_R) \in \mathbb{R}^{m\times k} \times \mathbb{R}^{n\times k}: U_L^T L R^T R = L^T L R^T U_R \}$ | A horizontal vector $(U_L, U_R)$ is represented as a structure with two fields. $U_L$ and $U_R$ are represented with matrices `U.L` and `U.R` of size $m\times k$ and $n \times k$ respectively. |

Ambient space | $\mathbb{R}^{m\times k}\times \mathbb{R}^{n\times k}$ | Vectors in the ambient space are, naturally, represented in the same way as horizontal vectors, but without restrictions on the matrices `U.L` and `U.R` . |

The following table shows some of the nontrivial available functions in the structure `M`

. The tutorial page gives more details about the functionality implemented by each function.

Name | Field usage | Formula |

Dimension | `M.dim() ` |
$\operatorname{dim}\mathcal{M} = (m + n - k)k;$ |

Metric | `M.inner(X, U, V) ` |
$\langle U, V\rangle_X = \operatorname{trace}((L^TL)^{-1}U_L^T V_L) + \operatorname{trace}((R^T R)^{-1}U_R^T V_R)$ |

Norm | `M.norm(X, U) ` |
$\|U\|_X = \sqrt{\langle U, U \rangle_X}$ |

Horizontal space projector | `M.proj(X, H) ` |
$P_X(H) = (H_L + L\Lambda^T, H_R - R \Lambda)$, where the structure `H` represents a vector in the ambient space. $\Lambda$ is the unique solution to the Lyapunov equation, $\Lambda (L^T L) (R^T R) + (L^T L)( R^T R) \Lambda = (L^T L) (R^T H_R) - (H_L^T L) (R^T R)$. The output is a horizontal vector at $(L, R)$. |

Euclidean to Riemannian gradient | `M.egrad2rgrad(X, egrad)` |
$\operatorname{grad} f(L, R) = (\nabla_Lf(X)\ L^T L,\nabla_R f(X)\ R^T R)$, where `egrad` represents the Euclidean gradient as a structure with two fields corresponding to the partial derivatives $(\nabla _L f(X), \nabla_R f(X))$, which is a vector in the ambient space. |

Euclidean to Riemannian Hessian | `M.ehess2rhess(X, egrad, ehess, U)` |
$\operatorname{Hess} f(X)[U] = P_X(T_L, T_R)$, where \[ \begin{array}{lll} T_L = & (\nabla^2 f(X)[U])_L L^T L + 2\nabla_Lf(X){\rm sym}(L^T U_L) - U_L (L^T L)^{-1} {\rm sym}(L^T \nabla_Lf(X) L^T L) \\ & - \nabla_Lf(X) {\rm sym}(L^T U_L) + L (L^T L)^{-1} {\rm sym}(U_L ^T \nabla_Lf(X) L^T L ) \\ T_R = & (\nabla^2 f(X)[U])_R R^T R + 2\nabla_R f(X){\rm sym}(R^T U_R) - U_R (R^T R)^{-1} {\rm sym}(R^T \nabla_R f(X) R^T R) \\ & - \nabla_R f(X) {\rm sym}(R^T U_R) + R (R^T R)^{-1} {\rm sym}(U_R ^T \nabla_R f(X) R^T R ). \end{array} \] Here ${\rm sym}(\cdot)$ extracts the symmetric part of a square matrix, i.e., ${\rm sym}(A) = (A + A^T) /2$. `egrad` represents the Euclidean gradient $(\nabla _L f(X), \nabla _R f(X))$ and `ehess` represents the Euclidean Hessian $({(\nabla^2 f(X)[U])_L}, (\nabla^2 f(X)[U])_R )$, both being vectors in the ambient space. |

Retraction | `M.retr(X, U, t) ` |
$\operatorname{Retr}_X(tU) = (L + tU_L, R + tU_R)$ |

Random point | `M.rand() ` |
Returns a random point generated as follows: generate $L$ and $R$ with i.i.d. normal entries; return $(L, R)$. With high probability, $L$ and $R$ are indeed full-rank. |

Random vector | `M.randvec(X) ` |
Returns a horizontal vector at $X$ with uniformly random direction, obtained as follows: generate $H_L$ and $H_R$ with i.i.d. normal entries; return $P_X(H)/\|P_X(H)\|_X$. |

Vector transport | `M.transp(X, Y, U) ` |
$\operatorname{Transp}_{Y\leftarrow X}(U) = P_{(L_Y,R_Y)} (U)$, where $U$ is a horizontal vector at $(L_X, R_X)$ that is transported to the horizontal space at $(L_Y, R_Y)$. |

This is an example of low-rank matrix approximation. Let $A\in\mathbb{R}^{m\times n}$ be given. The best rank-$k$ approximation of $A$ w.r.t. the Frobenius norm can be computed via SVD of $A$. Here, instead, we will compute it using optimization on the manifold we just described. There is no claim that this is a good way to proceed for this particular problem: this is merely an example to illustrate the use of the manifold.

We wish to minimize the following cost function:

$$\phi(X) = \frac{1}{2}\| X - A \|_F ^2,$$

such that $X \in \mathbb{R}_k ^{m \times n}$. $X$ is factorized as $LR ^T$ with $L \in \mathbb{R}_k ^{m \times k}$ and $R \in \mathbb{R}_k ^{n \times k}$. The function $f:\mathbb{R}_k ^{m \times k} \times \mathbb{R}_k ^{n \times k} \rightarrow \mathbb{R}$ is defined as $f(L, R) = (1/2) \cdot \| LR^T - A \|_F ^2 $. Compute the Euclidean gradient and Hessian of $f$ w.r.t. $L$ and $R$:

$$\nabla _L f(X) = (LR^T - A)R,$$

$$\nabla _R f(X) = (LR^T - A)^T L,$$

$$(\nabla^2 f(X)[U])_L = (U_L R^T + LU_R^T )R + (LR^T - A)U_R, $$

$$(\nabla^2 f(X)[U])_R = (U_L R^T + LU_R^T )^T L + (LR^T - A)^T U_L.$$

The Riemannian gradient and Hessian are obtained automatically by applying the `M.egrad2rgrad`

and `M.ehess2rhess`

operators. After the prototyping stage, it may be helpful though to check what the expressions for the Riemannian gradient and Hessian turn out to be, as it is often possible to compute them more efficiently then with the generic conversion tools.

function fixedrank_test() % Generate the problem data. % Problem m = 500; n = 1000; A = randn(m, n); % We aim for a rank k approximation of A k = 5; % Create the problem structure. manifold = fixedrankfactory_2factors(m,n,k); problem.M = manifold; % Define the problem cost function and its gradient. problem.cost = @cost; function f = cost(X) f = .5*norm(X.L*X.R' - A, 'fro')^2; end problem.grad = @(X) problem.M.egrad2rgrad(X, egrad(X)); function g = egrad(X) P = (X.L*X.R' - A); g.L = P*X.R; g.R = P'*X.L; end problem.hess = @(X, U) problem.M.ehess2rhess(X, egrad(X), ehess(X, U), U); function Hess = ehess(X, eta) P = (X.L*X.R' - A); Pdot = (eta.L*X.R' + X.L*eta.R'); Hess.L = Pdot*X.R + P*eta.R; Hess.R = Pdot'*X.L + P'*eta.L; end % Numerically check the differentials. checkgradient(problem); pause; checkhessian(problem); pause; X = trustregions(problem); end

Notice that there are a lot of redundant computations in this implementation. For example, computing the gradient at $(L, R)$ will recompute the residue $LR^T - A$ which was probably already computed when the cost function was evaluated at the same point. See the tutorial for ways to avoid redundancy. This can result in significant speed ups.

Notice that the `checkhessian`

test fails: the slope is not right. This is because the retraction is not second-order compatible with the Riemannian exponential on this manifold, making the `checkhessian`

tool unusable. The Hessian is correct though. The warning about the exponential map can be disabled with this command: `warning('off', 'manopt:fixedrankfactory_2factors:exp')`

.

[MMBS12] B. Mishra, G. Meyer, S. Bonnabel and R. Sepulchre, Fixed-rank matrix factorizations and Riemannian low-rank optimization, Tech. report, arXiv:1209.0430, 2012

[AMS08] P.-A. Absil, R. Mahony and R. Sepulchre, Optimization Algorithms on Matrix Manifolds, Princeton University Press, 2008.