Home > manopt > manifolds > fixedrank > fixedrankembeddedfactory.m

# fixedrankembeddedfactory

## PURPOSE

Manifold struct to optimize fixed-rank matrices w/ an embedded geometry.

## SYNOPSIS

function M = fixedrankembeddedfactory(m, n, k)

## DESCRIPTION

Manifold struct to optimize fixed-rank matrices w/ an embedded geometry.

function M = fixedrankembeddedfactory(m, n, k)

Manifold of m-by-n real matrices of fixed rank k. This follows the
embedded geometry described in Bart Vandereycken's 2013 paper:
"Low-rank matrix completion by Riemannian optimization".

A point X on the manifold is represented as a structure with three
fields: U, S and V. The matrices U (mxk) and V (nxk) are orthonormal,
while the matrix S (kxk) is any /diagonal/ full-rank matrix.
Following the SVD formalism, X = U*S*V'. Note that the diagonal entries
of S are not constrained to be nonnegative.

Tangent vectors are represented as a structure with three fields: Up, M
and Vp. The matrices Up (mxk) and Vp (mxk) obey Up'*U = 0 and Vp'*V = 0.
The matrix M (kxk) is arbitrary. Such a structure corresponds to the
following tangent vector in the ambient space of mxn matrices:
Z = U*M*V' + Up*V' + U*Vp'
where (U, S, V) is the current point and (Up, M, Vp) is the tangent
vector at that point.

Vectors in the ambient space are best represented as mxn matrices. If
these are low-rank, they may also be represented as structures with
U, S, V fields, such that Z = U*S*V'. There are no restrictions on what
U, S and V are, as long as their product as indicated yields a real, mxn
matrix.

The chosen geometry yields a Riemannian submanifold of the embedding
space R^(mxn) equipped with the usual trace (Frobenius) inner product.

The tools
X_triplet = M.matrix2triplet(X_matrix) and
X_matrix = M.triplet2matrix(X_triplet)
can be used to easily convert between full matrix representation (as a
matrix of size mxn) and triplet representation as a structure with fields
U, S, V. The tool matrix2triplet also accepts an optional second input r
to choose the rank of the triplet representation. By default, r = k. If
the input matrix X_matrix has rank more than r, the triplet represents
its best rank-r approximation in the Frobenius norm (truncated SVD).
Note that these conversions are computationally expensive for large m
and n: this is mostly useful for small matrices and for prototyping.

Please cite the Manopt paper as well as the research paper:
@Article{vandereycken2013lowrank,
Title   = {Low-rank matrix completion by {Riemannian} optimization},
Author  = {Vandereycken, B.},
Journal = {SIAM Journal on Optimization},
Year    = {2013},
Number  = {2},
Pages   = {1214--1236},
Volume  = {23},
Doi     = {10.1137/110845768}
}

## CROSS-REFERENCE INFORMATION

This function calls:
• stiefelfactory Returns a manifold structure to optimize over orthonormal matrices.
• norm NORM Norm of a TT/MPS tensor.
• norm NORM Norm of a TT/MPS block-mu tensor.
• hashmd5 Computes the MD5 hash of input data.
• lincomb Computes a linear combination of tangent vectors in the Manopt framework.
This function is called by:

## SOURCE CODE

0001 function M = fixedrankembeddedfactory(m, n, k)
0002 % Manifold struct to optimize fixed-rank matrices w/ an embedded geometry.
0003 %
0004 % function M = fixedrankembeddedfactory(m, n, k)
0005 %
0006 % Manifold of m-by-n real matrices of fixed rank k. This follows the
0007 % embedded geometry described in Bart Vandereycken's 2013 paper:
0008 % "Low-rank matrix completion by Riemannian optimization".
0009 %
0011 %
0012 % A point X on the manifold is represented as a structure with three
0013 % fields: U, S and V. The matrices U (mxk) and V (nxk) are orthonormal,
0014 % while the matrix S (kxk) is any /diagonal/ full-rank matrix.
0015 % Following the SVD formalism, X = U*S*V'. Note that the diagonal entries
0016 % of S are not constrained to be nonnegative.
0017 %
0018 % Tangent vectors are represented as a structure with three fields: Up, M
0019 % and Vp. The matrices Up (mxk) and Vp (mxk) obey Up'*U = 0 and Vp'*V = 0.
0020 % The matrix M (kxk) is arbitrary. Such a structure corresponds to the
0021 % following tangent vector in the ambient space of mxn matrices:
0022 %   Z = U*M*V' + Up*V' + U*Vp'
0023 % where (U, S, V) is the current point and (Up, M, Vp) is the tangent
0024 % vector at that point.
0025 %
0026 % Vectors in the ambient space are best represented as mxn matrices. If
0027 % these are low-rank, they may also be represented as structures with
0028 % U, S, V fields, such that Z = U*S*V'. There are no restrictions on what
0029 % U, S and V are, as long as their product as indicated yields a real, mxn
0030 % matrix.
0031 %
0032 % The chosen geometry yields a Riemannian submanifold of the embedding
0033 % space R^(mxn) equipped with the usual trace (Frobenius) inner product.
0034 %
0035 % The tools
0036 %    X_triplet = M.matrix2triplet(X_matrix) and
0037 %    X_matrix = M.triplet2matrix(X_triplet)
0038 % can be used to easily convert between full matrix representation (as a
0039 % matrix of size mxn) and triplet representation as a structure with fields
0040 % U, S, V. The tool matrix2triplet also accepts an optional second input r
0041 % to choose the rank of the triplet representation. By default, r = k. If
0042 % the input matrix X_matrix has rank more than r, the triplet represents
0043 % its best rank-r approximation in the Frobenius norm (truncated SVD).
0044 % Note that these conversions are computationally expensive for large m
0045 % and n: this is mostly useful for small matrices and for prototyping.
0046 %
0047 %
0048 % Please cite the Manopt paper as well as the research paper:
0049 %     @Article{vandereycken2013lowrank,
0050 %       Title   = {Low-rank matrix completion by {Riemannian} optimization},
0051 %       Author  = {Vandereycken, B.},
0052 %       Journal = {SIAM Journal on Optimization},
0053 %       Year    = {2013},
0054 %       Number  = {2},
0055 %       Pages   = {1214--1236},
0056 %       Volume  = {23},
0057 %       Doi     = {10.1137/110845768}
0058 %     }
0059 %
0061
0062 % This file is part of Manopt: www.manopt.org.
0063 % Original author: Nicolas Boumal, Dec. 30, 2012.
0064 % Contributors: Bart Vandereycken, Eitan Levin
0065 % Change log:
0066 %
0067 %   Feb. 20, 2014 (NB):
0069 %
0070 %   June 24, 2014 (NB):
0071 %       A couple modifications following Bart's feedback:
0072 %       - The checksum (hash) was replaced for a faster alternative: it's a
0073 %         bit less "safe" in that collisions could arise with higher
0074 %         probability, but they're still very unlikely.
0075 %       - The vector transport was changed.
0076 %       The typical distance was also modified, hopefully giving the
0077 %       trustregions method a better initial guess for the trust-region
0078 %       radius, but that should be tested for different cost functions too.
0079 %
0080 %    July 11, 2014 (NB):
0081 %       Added ehess2rhess and tangent2ambient, supplied by Bart.
0082 %
0083 %    July 14, 2014 (NB):
0084 %       Added vec, mat and vecmatareisometries so that hessianspectrum now
0085 %       works with this geometry. Implemented the tangent function.
0086 %       Made it clearer in the code and in the documentation in what format
0087 %       ambient vectors may be supplied, and generalized some functions so
0088 %       that they should now work with both accepted formats.
0089 %       It is now clearly stated that for a point X represented as a
0090 %       triplet (U, S, V), the matrix S needs to be diagonal.
0091 %
0092 %    Sep.  6, 2018 (NB):
0093 %       Removed M.exp() as it was not implemented.
0094 %
0095 %    March 20, 2019 (NB):
0096 %       Added M.matrix2triplet and M.triplet2matrix to allow easy
0097 %       conversion between matrix representations either as full matrices
0098 %       or as triplets (U, S, V).
0099 %
0100 %    Dec. 14, 2019 (EL):
0101 %       The original retraction code was repaced with a somewhat slower but
0102 %       numerically more stable version. With the original code, trouble
0103 %       could arise when the matrices Up, Vp defining the tangent vector
0104 %       being retracted were ill-conditioned.
0105 %
0106 %    Jan. 28, 2020 (NB):
0107 %       In retraction code, moved parameter t around to highlight the fact
0108 %       that it comes up in only one computation.
0109 %       Replaced vec/mat codes: they are still isometries, but they produce
0110 %       representations of length k*(m+n+k) instead of m*n, which is much
0111 %       more efficient: it only exceeds the true dimension by 2k^2. Also,
0112 %       mat does not attempt to project to the tangent space (which it did
0113 %       before but shouldn't): compose mat with tangent if that is the
0114 %       desired effect.
0115
0116     M.name = @() sprintf('Manifold of %dx%d matrices of rank %d', m, n, k);
0117
0118     M.dim = @() (m+n-k)*k;
0119
0120     M.inner = @(x, d1, d2) d1.M(:).'*d2.M(:) + d1.Up(:).'*d2.Up(:) ...
0121                                              + d1.Vp(:).'*d2.Vp(:);
0122
0123     M.norm = @(x, d) sqrt(norm(d.M, 'fro')^2 + norm(d.Up, 'fro')^2 ...
0124                                              + norm(d.Vp, 'fro')^2);
0125
0126     M.dist = @(x, y) error('fixedrankembeddedfactory.dist not implemented yet.');
0127
0128     M.typicaldist = @() M.dim();
0129
0130     % Given Z in tangent vector format, projects the components Up and Vp
0131     % such that they satisfy the tangent space constraints up to numerical
0132     % errors. If Z was indeed a tangent vector at X, this should barely
0133     % affect Z (it would not at all if we had infinite numerical accuracy).
0134     M.tangent = @tangent;
0135     function Z = tangent(X, Z)
0136         Z.Up = Z.Up - X.U*(X.U'*Z.Up);
0137         Z.Vp = Z.Vp - X.V*(X.V'*Z.Vp);
0138     end
0139
0140     % For a given ambient vector Z, applies it to a matrix W. If Z is given
0141     % as a matrix, this is straightforward. If Z is given as a structure
0142     % with fields U, S, V such that Z = U*S*V', the product is executed
0143     % efficiently.
0144     function ZW = apply_ambient(Z, W)
0145         if ~isstruct(Z)
0146             ZW = Z*W;
0147         else
0148             ZW = Z.U*(Z.S*(Z.V'*W));
0149         end
0150     end
0151
0152     % Same as apply_ambient, but applies Z' to W.
0153     function ZtW = apply_ambient_transpose(Z, W)
0154         if ~isstruct(Z)
0155             ZtW = Z'*W;
0156         else
0157             ZtW = Z.V*(Z.S'*(Z.U'*W));
0158         end
0159     end
0160
0161     % Orthogonal projection of an ambient vector Z represented as an mxn
0162     % matrix or as a structure with fields U, S, V to the tangent space at
0163     % X, in a tangent vector structure format.
0164     M.proj = @projection;
0165     function Zproj = projection(X, Z)
0166
0167         ZV = apply_ambient(Z, X.V);
0168         UtZV = X.U'*ZV;
0169         ZtU = apply_ambient_transpose(Z, X.U);
0170
0171         Zproj.M = UtZV;
0172         Zproj.Up = ZV  - X.U*UtZV;
0173         Zproj.Vp = ZtU - X.V*UtZV';
0174
0175     end
0176
0178
0179     % Code supplied by Bart.
0180     % Given the Euclidean gradient at X and the Euclidean Hessian at X
0181     % along H, where egrad and ehess are vectors in the ambient space and H
0182     % is a tangent vector at X, returns the Riemannian Hessian at X along
0183     % H, which is a tangent vector.
0184     M.ehess2rhess = @ehess2rhess;
0185     function rhess = ehess2rhess(X, egrad, ehess, H)
0186
0187         % Euclidean part
0188         rhess = projection(X, ehess);
0189
0190         % Curvature part
0192         rhess.Up = rhess.Up + (T - X.U*(X.U'*T));
0194         rhess.Vp = rhess.Vp + (T - X.V*(X.V'*T));
0195
0196     end
0197
0198     % Transforms a tangent vector Z represented as a structure (Up, M, Vp)
0199     % into a structure with fields (U, S, V) that represents that same
0200     % tangent vector in the ambient space of mxn matrices, as U*S*V'.
0201     % This matrix is equal to X.U*Z.M*X.V' + Z.Up*X.V' + X.U*Z.Vp'. The
0202     % latter is an mxn matrix, which could be too large to build
0203     % explicitly, and this is why we return a low-rank representation
0204     % instead. Note that there are no guarantees on U, S and V other than
0205     % that USV' is the desired matrix. In particular, U and V are not (in
0206     % general) orthonormal and S is not (in general) diagonal.
0207     % (In this implementation, S is identity, but this might change.)
0208     M.tangent2ambient_is_identity = false;
0209     M.tangent2ambient = @tangent2ambient;
0210     function Zambient = tangent2ambient(X, Z)
0211         Zambient.U = [X.U*Z.M + Z.Up, X.U];
0212         Zambient.S = eye(2*k);
0213         Zambient.V = [X.V, Z.Vp];
0214     end
0215
0216     % This retraction is second order, following general results from
0217     % Absil, Malick, "Projection-like retractions on matrix manifolds",
0218     % SIAM J. Optim., 22 (2012), pp. 135-158.
0219     %
0220     % Notice that this retraction is only locally smooth. Indeed, the
0221     % following code exhibits a discontinuity when retracting from
0222     % X = [1 0 ; 0 0] along V = [0 0 ; 0 1]:
0223     %
0224     % M = fixedrankembeddedfactory(2, 2, 1);
0225     % X = struct('U', [1;0], 'V', [1;0], 'S', 1);
0226     % V = struct('Up', [0;1], 'Vp', [0;1], 'M', 1);
0227     % entry = @(M) M(1, 1);
0228     % mat = @(X) X.U*X.S*X.V';
0229     % g = @(t) entry(mat(M.retr(X, V, t)));
0230     % ezplot(g, [-2, 2]);
0231     %
0232     M.retr = @retraction;
0233     function Y = retraction(X, Z, t)
0234         if nargin < 3
0235             t = 1.0;
0236         end
0237
0238         % Mathematically, Z.Up is orthogonal to X.U, and likewise for
0239         % Z.Vp compared to X.V. Thus, in principle, we could call QR
0240         % on Z.Up and Z.Vp alone, which should be about 4 times faster
0241         % than the calls here where we orthonormalize twice as many
0242         % vectors. However, when Z.Up, Z.Vp are poorly conditioned,
0243         % orthonormalizing them can lead to loss of orthogonality
0244         % against X.U, X.V.
0245         [Qu, Ru] = qr([X.U, Z.Up], 0);
0246         [Qv, Rv] = qr([X.V, Z.Vp], 0);
0247
0248         % Calling svds or svd should yield the same result, but BV
0249         % advocated svd is more robust, and it doesn't change the
0250         % asymptotic complexity to call svd then trim rather than call
0251         % svds. Also, apparently Matlab calls ARPACK in a suboptimal way
0252         % for svds in this scenario.
0253         % Notice that the parameter t appears only here. Thus, in princple,
0254         % we could make some savings for line-search procedures where we
0255         % retract the same vector multiple times, only with different
0256         % values of t. The asymptotic complexity remains the same though
0257         % (up to a constant factor) because of the matrix-matrix products
0258         % below which cost essentially the same as the QR factorizations.
0259         [U, S, V] = svd(Ru*[X.S + t*Z.M, t*eye(k); t*eye(k), zeros(k)]*Rv');
0260
0261         Y.U = Qu*U(:, 1:k);
0262         Y.V = Qv*V(:, 1:k);
0263         Y.S = S(1:k, 1:k);
0264
0265         % Equivalent but very slow code
0266         % [U, S, V] = svds(X.U*X.S*X.V' + t*(X.U*Z.M*X.V' + Z.Up*X.V' + X.U*Z.Vp'), k);
0267         % Y.U = U; Y.V = V; Y.S = S;
0268     end
0269
0270
0271     % Orthographic retraction provided by Teng Zhang. One interest of the
0272     % orthographic retraction is that if matrices are represented in full
0273     % size, it can be computed without any SVDs. If for an application it
0274     % makes sense to represent the matrices in full size, this may be a
0275     % good idea, but it won't shine in the present implementation of the
0276     % manifold.
0277     M.retr_ortho = @retraction_orthographic;
0278     function Y = retraction_orthographic(X, Z, t)
0279         if nargin < 3
0280             t = 1.0;
0281         end
0282
0283         % First, write Y (the output) as U1*S0*V1', where U1 and V1 are
0284         % orthogonal matrices and S0 is of size r by r.
0285         [U1, ~] = qr(t*(X.U*Z.M  + Z.Up) + X.U*X.S, 0);
0286         [V1, ~] = qr(t*(X.V*Z.M' + Z.Vp) + X.V*X.S, 0);
0287         S0 = (U1'*X.U)*(X.S + t*Z.M)*(X.V'*V1) ...
0288                          + t*((U1'*Z.Up)*(X.V'*V1) + (U1'*X.U)*(Z.Vp'*V1));
0289
0290         % Then, obtain the singular value decomposition of Y.
0291         [U2, S2, V2] = svd(S0);
0292         Y.U = U1*U2;
0293         Y.S = S2;
0294         Y.V = V1*V2;
0295
0296     end
0297
0298
0299     % Less safe but much faster checksum, June 24, 2014.
0300     % Older version right below.
0301     M.hash = @(X) ['z' hashmd5([sum(X.U(:)) ; sum(X.S(:)); sum(X.V(:)) ])];
0302     %M.hash = @(X) ['z' hashmd5([X.U(:) ; X.S(:) ; X.V(:)])];
0303
0304     M.rand = @random;
0305     % Factors U, V live on Stiefel manifolds: reuse their random generator.
0306     stiefelm = stiefelfactory(m, k);
0307     stiefeln = stiefelfactory(n, k);
0308     function X = random()
0309         X.U = stiefelm.rand();
0310         X.V = stiefeln.rand();
0311         X.S = diag(sort(rand(k, 1), 1, 'descend'));
0312     end
0313
0314     % Generate a random tangent vector at X.
0315     % Note: this may not be the uniform distribution over the set of
0316     % unit-norm tangent vectors.
0317     M.randvec = @randomvec;
0318     function Z = randomvec(X)
0319         Z.M  = randn(k);
0320         Z.Up = randn(m, k);
0321         Z.Vp = randn(n, k);
0322         Z = tangent(X, Z);
0323         nrm = M.norm(X, Z);
0324         Z.M  = Z.M  / nrm;
0325         Z.Up = Z.Up / nrm;
0326         Z.Vp = Z.Vp / nrm;
0327     end
0328
0329     M.lincomb = @lincomb;
0330
0331     M.zerovec = @(X) struct('M', zeros(k, k), 'Up', zeros(m, k), ...
0332                                               'Vp', zeros(n, k));
0333
0334     % New vector transport on June 24, 2014 (as indicated by Bart)
0335     % Reference: Absil, Mahony, Sepulchre 2008 section 8.1.3:
0336     % For Riemannian submanifolds of a Euclidean space, it is acceptable to
0337     % transport simply by orthogonal projection of the tangent vector
0338     % translated in the ambient space.
0339     M.transp = @project_tangent;
0340     function Z2 = project_tangent(X1, X2, Z1)
0341         Z2 = projection(X2, tangent2ambient(X1, Z1));
0342     end
0343
0344     % The function 'vec' is isometric from the tangent space at X to real
0345     % vectors of length k(m+n+k). The function 'mat' is the left-inverse
0346     % of 'vec'. It is sometimes useful to apply 'tangent' to the output of
0347     % 'mat'.
0348     M.vec = @vec;
0349     function Zvec = vec(X, Z) %#ok<INUSL>
0350         A = Z.M;
0351         B = Z.Up;
0352         C = Z.Vp;
0353         Zvec = [A(:) ; B(:) ; C(:)];
0354     end
0355     rangeM = 1:(k^2);
0356     rangeUp = (k^2)+(1:(m*k));
0357     rangeVp = (k^2+m*k)+(1:(n*k));
0358     M.mat = @(X, Zvec) struct('M',  reshape(Zvec(rangeM),  [k, k]), ...
0359                               'Up', reshape(Zvec(rangeUp), [m, k]), ...
0360                               'Vp', reshape(Zvec(rangeVp), [n, k]));
0361     M.vecmatareisometries = @() true;
0362
0363
0364     % It is sometimes useful to switch between representation of matrices
0365     % as triplets or as full matrices of size m x n. The function to
0366     % convert a matrix to a triplet, matrix2triplet, allows to specify the
0367     % rank of the representation. By default, it is equal to k. Omit the
0368     % second input (or set to inf) to get a full SVD triplet (in economy
0369     % format). If so, the resulting triplet does not represent a point on
0370     % the manifold.
0371     M.matrix2triplet = @matrix2triplet;
0372     function X_triplet = matrix2triplet(X_matrix, r)
0373         if ~exist('r', 'var') || isempty(r) || r <= 0
0374             r = k;
0375         end
0376         if r < min(m, n)
0377             [U, S, V] = svds(X_matrix, r);
0378         else
0379             [U, S, V] = svd(X_matrix, 'econ');
0380         end
0381         X_triplet.U = U;
0382         X_triplet.S = S;
0383         X_triplet.V = V;
0384     end
0385     M.triplet2matrix = @triplet2matrix;
0386     function X_matrix = triplet2matrix(X_triplet)
0387         U = X_triplet.U;
0388         S = X_triplet.S;
0389         V = X_triplet.V;
0390         X_matrix = U*S*V';
0391     end
0392
0393 end
0394
0395 % Linear combination of tangent vectors
0396 function d = lincomb(x, a1, d1, a2, d2) %#ok<INUSL>
0397
0398     if nargin == 3
0399         d.Up = a1*d1.Up;
0400         d.Vp = a1*d1.Vp;
0401         d.M  = a1*d1.M;
0402     elseif nargin == 5
0403         d.Up = a1*d1.Up + a2*d2.Up;
0404         d.Vp = a1*d1.Vp + a2*d2.Vp;
0405         d.M  = a1*d1.M  + a2*d2.M;
0406     else
0407         error('fixedrank.lincomb takes either 3 or 5 inputs.');
0408     end
0409
0410 end

Generated on Fri 30-Sep-2022 13:18:25 by m2html © 2005