
grassmanngeneralizedfactory

PURPOSE ^

Returns a manifold struct of "scaled" vector subspaces.

SYNOPSIS ^

function M = grassmanngeneralizedfactory(n, p, B)

DESCRIPTION ^

 Returns a manifold struct of "scaled" vector subspaces.

 function M = grassmanngeneralizedfactory(n, p)
 function M = grassmanngeneralizedfactory(n, p, B)

 Generalized Grassmann manifold: each point on this manifold is a
 collection of "scaled" vector subspaces of dimension p embedded in R^n.
 The scaling is due to the symmetric positive definite matrix B.

 When B is identity, the manifold is the standard Grassmann manifold.

 The metric is obtained by viewing the generalized Grassmannian as
 a Riemannian quotient manifold of the generalized Stiefel manifold,
 which is the manifold of "scaled" orthonormal matrices. Specifically,
 the generalized Stiefel manifold is the set {X : X'*B*X = I}.
 The generalized Grassmann manifold is the quotient of the generalized
 Stiefel manifold by the orthogonal group O(p).

 The generalized Stiefel manifold is endowed with a scaled metric
 by viewing it as a Riemannian submanifold of the Euclidean space, which
 is again endowed with the scaled inner product.
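
 To make the scaled inner product and the constraint X'*B*X = I concrete,
 here is a small sketch in Python with NumPy. It is purely illustrative
 (it is not part of Manopt), and it B-orthonormalizes a point through a
 Cholesky factor of B, which is one simple option rather than the
 SVD-based construction used later in this file:

```python
import numpy as np

n, p = 5, 2
rng = np.random.default_rng(0)

# A symmetric positive definite scaling matrix B.
A = rng.standard_normal((n, n))
B = A @ A.T + n * np.eye(n)

# Build a point with X'*B*X = I by B-orthonormalizing a random matrix.
Y = rng.standard_normal((n, p))
L = np.linalg.cholesky(B)          # B = L @ L.T
Q, _ = np.linalg.qr(L.T @ Y)       # Q has orthonormal columns
X = np.linalg.solve(L.T, Q)        # X = L^{-T} Q, so X' B X = Q' Q = I

# The scaled inner product on tangent vectors: trace(eta' * B * zeta).
eta = rng.standard_normal((n, p))
zeta = rng.standard_normal((n, p))
inner = np.trace(eta.T @ B @ zeta)
```

 With B = I this reduces to the usual QR orthonormalization and the
 standard trace inner product on the Grassmannian.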

 Some notions (not all) are from Section 4.5 of the paper
 "The geometry of algorithms with orthogonality constraints",
 A. Edelman, T. A. Arias, S. T. Smith, SIMAX, 1998.

 Paper link: http://arxiv.org/abs/physics/9806030.

 
 Note: some computations, such as restricted_svd, distance, logarithm and
 exponential, are new; we believe them to be correct and numerically
 stable. If something does not work as expected, please let us know
 via http://www.manopt.org.
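
 As an illustration of what restricted_svd computes, here is a sketch in
 Python with NumPy. The function name mirrors the MATLAB helper defined
 at the end of this file, but the code below is an assumption-laden
 re-derivation, not part of the package:

```python
import numpy as np

def restricted_svd(Y, B):
    # Factor Y = u @ np.diag(s) @ v.T with u'*B*u = I, v orthogonal and
    # s positive, via an eigendecomposition of Y'*B*Y (the same
    # construction the MATLAB helper uses).
    M = Y.T @ B @ Y
    w, v = np.linalg.eigh((M + M.T) / 2)   # symmetrize, as symm() does
    s = np.sqrt(np.abs(w))
    u = (Y @ v) / s                        # scales column j of Y*v by 1/s[j]
    return u, s, v

n, p = 6, 3
rng = np.random.default_rng(1)
A = rng.standard_normal((n, n))
B = A @ A.T + n * np.eye(n)
Y = rng.standard_normal((n, p))
u, s, v = restricted_svd(Y, B)
```

 Since Y'*B*Y = v diag(s^2) v', one checks that u'*B*u = I and that
 u diag(s) v' recovers Y whenever Y has full column rank.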

 Note: egrad2rgrad and ehess2rhess involve solving linear systems in B. If
 this is a bottleneck for a specific application, then a way forward is to
 create a modified version of this file which preprocesses B to speed this
 up (typically, by computing a Cholesky factorization of it, then calling
 an appropriate solver).

 See also: stiefelgeneralizedfactory  stiefelfactory  grassmannfactory

CROSS-REFERENCE INFORMATION ^

This function calls: This function is called by:

SUBFUNCTIONS ^

SOURCE CODE ^

function M = grassmanngeneralizedfactory(n, p, B)
% Returns a manifold struct of "scaled" vector subspaces.
%
% function M = grassmanngeneralizedfactory(n, p)
% function M = grassmanngeneralizedfactory(n, p, B)
%
% Generalized Grassmann manifold: each point on this manifold is a
% collection of "scaled" vector subspaces of dimension p embedded in R^n.
% The scaling is due to the symmetric positive definite matrix B.
%
% When B is identity, the manifold is the standard Grassmann manifold.
%
% The metric is obtained by viewing the generalized Grassmannian as
% a Riemannian quotient manifold of the generalized Stiefel manifold,
% which is the manifold of "scaled" orthonormal matrices. Specifically,
% the generalized Stiefel manifold is the set {X : X'*B*X = I}.
% The generalized Grassmann manifold is the quotient of the generalized
% Stiefel manifold by the orthogonal group O(p).
%
% The generalized Stiefel manifold is endowed with a scaled metric
% by viewing it as a Riemannian submanifold of the Euclidean space, which
% is again endowed with the scaled inner product.
%
% Some notions (not all) are from Section 4.5 of the paper
% "The geometry of algorithms with orthogonality constraints",
% A. Edelman, T. A. Arias, S. T. Smith, SIMAX, 1998.
%
% Paper link: http://arxiv.org/abs/physics/9806030.
%
% Note: some computations, such as restricted_svd, distance, logarithm and
% exponential, are new; we believe them to be correct and numerically
% stable. If something does not work as expected, please let us know
% via http://www.manopt.org.
%
% Note: egrad2rgrad and ehess2rhess involve solving linear systems in B. If
% this is a bottleneck for a specific application, then a way forward is to
% create a modified version of this file which preprocesses B to speed this
% up (typically, by computing a Cholesky factorization of it, then calling
% an appropriate solver).
%
% See also: stiefelgeneralizedfactory  stiefelfactory  grassmannfactory

% This file is part of Manopt: www.manopt.org.
% Original author: Bamdev Mishra, June 30, 2015.
% Contributors:
%
% Change log:
%   Jan. 4, 2021 (NB):
%       Moved symm() function to end of file for compatibility with Octave 6.1.0.

    assert(n >= p, ...
        ['The dimension n of the ambient space must be at least ' ...
        'as large as the dimension p of the subspaces.']);

    if ~exist('B', 'var') || isempty(B)
        B = speye(n); % Standard Grassmann manifold.
    end

    M.name = @() sprintf('Generalized Grassmann manifold Gr(%d, %d)', n, p);

    M.dim = @() p*(n - p);

    M.inner = @(X, eta, zeta) trace(eta'*(B*zeta)); % Scaled metric, but horizontally invariant.

    M.norm = @(X, eta) sqrt(M.inner(X, eta, eta));

    M.dist = @distance;
    function d = distance(X, Y)
        XtBY = X'*(B*Y); % XtY ---> XtBY
        cos_princ_angle = svd(XtBY); % svd(XtY) ---> svd(XtBY)
        % The next two instructions are not necessary: the imaginary parts
        % that would appear if the cosines were not between -1 and 1, when
        % passed to the acos function, would be very small, and would
        % thus vanish when the norm is taken.
        % cos_princ_angle = min(cos_princ_angle,  1);
        % cos_princ_angle = max(cos_princ_angle, -1);
        d = norm(acos(cos_princ_angle));
    end

    M.typicaldist = @() sqrt(p);

    % Orthogonal projection of an ambient vector U onto the
    % horizontal space at X.
    M.proj = @projection;
    function Up = projection(X, U)
        BX = B*X;

        % Projection onto the tangent space:
        % U = U - X*symm(BX'*U);
        % Projection onto the horizontal space:
        % Up = U - X*skew(BX'*U);
        % Since symm(A) + skew(A) = A, both steps combine into:
        Up = U - X*(BX'*U);
    end

    M.tangent = M.proj;

    M.egrad2rgrad = @egrad2rgrad;
    function rgrad = egrad2rgrad(X, egrad)

        % First, scale egrad according to the scaled metric in the
        % Euclidean space.
        egrad_scaled = B\egrad;

        % Second, project onto the tangent space.
        % There is no need to project onto the horizontal space: by
        % Riemannian submersion theory, this quantity automatically
        % belongs to the horizontal space.
        %
        % rgrad = egrad_scaled - X*symm((B*X)'*egrad_scaled);
        %
        % One can verify that symm(BX'*egrad_scaled) = symm(X'*egrad).

        rgrad = egrad_scaled - X*symm(X'*egrad);
    end

    M.ehess2rhess = @ehess2rhess;
    function rhess = ehess2rhess(X, egrad, ehess, H)
        egraddot = ehess;
        Xdot = H;

        % Directional derivative of the Riemannian gradient.
        egrad_scaleddot = B\egraddot;
        rgraddot = egrad_scaleddot - Xdot*symm(X'*egrad) ...
            - X*symm(Xdot'*egrad) ...
            - X*symm(X'*egraddot);

        % Project onto the horizontal space.
        rhess = M.proj(X, rgraddot);
    end

    M.retr = @retraction;
    function Y = retraction(X, U, t)
        if nargin < 3
            t = 1.0;
        end
        Y = guf(X + t*U); % Ensures that Y'*B*Y is identity.
    end

    M.exp = @exponential;
    function Y = exponential(X, U, t)
        if nargin == 3
            tU = t*U;
        else
            tU = U;
        end

        % restricted_svd is defined later in the file.
        [u, s, v] = restricted_svd(tU); % svd(tU, 0) ---> restricted_svd(tU)
        cos_s = diag(cos(diag(s)));
        sin_s = diag(sin(diag(s)));
        Y = X*v*cos_s*v' + u*sin_s*v'; % Y'*B*Y should be identity.

        % From numerical experiments, it seems necessary to
        % re-orthonormalize.
        Y = guf(Y); % Ensures that Y'*B*Y is identity.
    end

    % Test code for the logarithm:
    % gGr = grassmanngeneralizedfactory(5, 2, diag(rand(5,1)));
    % x = gGr.rand()
    % y = gGr.rand()
    % u = gGr.log(x, y)
    % gGr.dist(x, y) % These two numbers should
    % gGr.norm(x, u) % be the same.
    % z = gGr.exp(x, u) % z need not be the same matrix as y, but it should
    % v = gGr.log(x, z) % be the same point as y on Grassmann: dist almost 0.
    % gGr.dist(z, y)
    M.log = @logarithm;
    function U = logarithm(X, Y)
        YtBX = Y'*(B*X); % YtX ---> YtBX
        At = Y' - YtBX*X';
        Bt = YtBX\At;
        [u, s, v] = restricted_svd(Bt'); % svd(Bt', 'econ') ---> restricted_svd(Bt')

        u = u(:, 1:p);
        s = diag(s);
        s = s(1:p);
        v = v(:, 1:p);
        U = u*diag(atan(s))*v'; % A horizontal vector, i.e., U'*(B*X) is zero.
    end

    M.hash = @(X) ['z' hashmd5(X(:))];

    M.rand = @random;
    function X = random()
        X = guf(randn(n, p)); % Ensures that X'*B*X is identity.
    end

    M.randvec = @randomvec;
    function U = randomvec(X)
        U = projection(X, randn(n, p));
        U = U / norm(U(:));
    end

    M.lincomb = @matrixlincomb;

    M.zerovec = @(X) zeros(n, p);

    % This transport is compatible with the generalized polar retraction.
    M.transp = @(X1, X2, d) projection(X2, d);

    M.vec = @(X, u_mat) u_mat(:);
    M.mat = @(X, u_vec) reshape(u_vec, [n, p]);
    M.vecmatareisometries = @() false;

    % Some auxiliary functions.
    function X = guf(Y)
        % Generalized polar decomposition of an n-by-p matrix Y,
        % such that X'*B*X is identity.

        % Method 1:
        [u, ~, v] = svd(Y, 0);

        % Instead of the following three steps, an equivalent but more
        % expensive way is to do X = u*(sqrtm(u'*(B*u))\(v')).
        [q, ssquare] = eig(u'*(B*u));
        qsinv = q/sparse(diag(sqrt(diag(ssquare))));
        X = u*((qsinv*q')*v'); % X'*B*X is identity.

        % Another computation, using restricted_svd:
        % [u, ~, v] = restricted_svd(Y);
        % X = u*v'; % X'*B*X is identity.
    end

    function [u, s, v] = restricted_svd(Y)
        % We compute a thin svd-like decomposition of an n-by-p matrix Y
        % into matrices u, s, and v such that u is an n-by-p matrix
        % with u'*B*u being identity, s is a p-by-p diagonal matrix
        % with positive entries, and v is a p-by-p orthogonal matrix,
        % such that Y = u*s*v'.

        [v, ssquare] = eig(symm(Y'*(B*Y))); % Y'*B*Y is positive definite.
        ssquarevec = diag(ssquare);

        s = sparse(diag(abs(sqrt(ssquarevec))));
        u = Y*(v/s); % u'*B*u is identity.
    end

end

function A = symm(Z)
    A = (Z + Z')/2;
end
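
The distance function above, which measures the 2-norm of the principal angles between two subspaces, translates to a short Python/NumPy sketch. The helper names below are hypothetical (they are not Manopt functions), and the B-orthonormalization uses a Cholesky factor rather than the guf routine from the file:

```python
import numpy as np

def b_orthonormalize(B, Y):
    # Return X with X'*B*X = I spanning the column space of Y
    # (a simple stand-in for the file's guf routine).
    L = np.linalg.cholesky(B)
    Q, _ = np.linalg.qr(L.T @ Y)
    return np.linalg.solve(L.T, Q)

def dist(B, X, Y):
    # Geodesic distance: the 2-norm of the principal angles, i.e. the
    # arccosines of the singular values of X'*B*Y. NumPy's arccos
    # returns NaN outside [-1, 1], so we clip here (the MATLAB code
    # instead relies on tiny imaginary parts vanishing in the norm).
    c = np.clip(np.linalg.svd(X.T @ B @ Y, compute_uv=False), -1.0, 1.0)
    return np.linalg.norm(np.arccos(c))

n, p = 5, 2
rng = np.random.default_rng(3)
A = rng.standard_normal((n, n))
B = A @ A.T + n * np.eye(n)
X = b_orthonormalize(B, rng.standard_normal((n, p)))
Y = b_orthonormalize(B, rng.standard_normal((n, p)))
```

As expected of a distance, dist(B, X, X) is (numerically) zero and dist is symmetric in its two subspace arguments, since X'*B*Y and Y'*B*X share singular values.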

Generated on Sun 05-Sep-2021 17:57:00 by m2html © 2005