function M = sympositivedefinitefactory(n)
% Manifold of n-by-n symmetric positive definite matrices with
% the bi-invariant geometry.
%
% function M = sympositivedefinitefactory(n)
%
% A point X on the manifold is represented as a symmetric positive definite
% matrix X (nxn). Tangent vectors are symmetric matrices of the same size
% (but not necessarily definite).
%
% The Riemannian metric is the bi-invariant metric, described notably in
% Chapter 6 of the 2007 book "Positive definite matrices"
% by Rajendra Bhatia, Princeton University Press.
%
% The retraction / exponential map involves expm (the matrix exponential).
% If too large a vector is retracted / exponentiated (e.g., a solver tries
% to take too large a step), this may result in NaN's in the returned
% point, which would most likely lead to NaN's in the cost / gradient / ...
% and cause the optimization to fail. For the trustregions solver, this can
% be controlled by setting options.Delta0 and options.Delta_bar to prevent
% overly large steps.
%
% Note also that many of the functions involve solving linear systems in X
% (a point on the manifold), taking matrix exponentials and logarithms,
% etc. It could therefore be beneficial to do some precomputation on X (an
% eigenvalue decomposition, for example) and to store both X and the
% preprocessing in a structure. This would require modifying the present
% factory to work with such structures to represent both points and tangent
% vectors. We omit this in favor of simplicity, but it may be good to keep
% it in mind if efficiency becomes an issue in your application.

% This file is part of Manopt: www.manopt.org.
% Original author: Bamdev Mishra, August 29, 2013.
% Contributors: Nicolas Boumal
% Change log:
%
%   March 5, 2014 (NB)
%       There were a number of mistakes in the code owing to the tacit
%       assumption that if X and eta are symmetric, then X\eta is
%       symmetric too, which is not the case. See the discussion on the
%       Manopt forum started on Jan. 19, 2014. The functions norm, dist,
%       exp and log were modified accordingly. Furthermore, they only
%       require matrix inversion (as well as matrix log or matrix exp),
%       not matrix square roots or their inverses.
%
%   July 28, 2014 (NB)
%       The dim() function returned n*(n-1)/2 instead of n*(n+1)/2.
%       Implemented proper parallel transport from Sra and Hosseini (not
%       used by default).
%       Also added symmetrization in exp and log (to be sure).
%
%   April 3, 2015 (NB)
%       Replaced trace(A*B) by a faster equivalent that does not compute
%       the whole product A*B, for the inner product, norm and distance.
%
%   May 23, 2017 (NB)
%       As seen in a talk by Wen Huang at the SIAM Optimization Conference
%       today, replaced the retraction of this factory (which was simply
%       equal to the exponential map) with a simpler, second-order
%       retraction. That this retraction is second order can be verified
%       numerically with checkretraction(sympositivedefinitefactory(5)).
%       Notice that, for this retraction, it is cheap to retract many
%       points along the same tangent direction, that is, to evaluate it
%       for many values of t. This could in principle be exploited to
%       speed up line-searches.
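% Usage sketch (kept as comments so this file remains valid MATLAB; the
% cost function, the matrix A and the call to trustregions below are
% hypothetical illustrations, not part of this factory; only the standard
% Manopt problem fields problem.M, problem.cost and problem.egrad are
% assumed). For instance, to minimize f(X) = trace(A*X) - log(det(X))
% over the SPD cone, whose minimizer is X = inv(A):
%
%   n = 5;
%   A = randn(n); A = A*A' + eye(n);        % a symmetric PD data matrix
%   problem.M = sympositivedefinitefactory(n);
%   problem.cost  = @(X) trace(A*X) - log(det(X));
%   problem.egrad = @(X) A - inv(X);        % Euclidean gradient of f
%   X = trustregions(problem);              % Riemannian trust-regions
%
% The factory's egrad2rgrad below converts the Euclidean gradient into
% the Riemannian gradient for the bi-invariant metric automatically.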
    symm = @(X) .5*(X+X');

    M.name = @() sprintf('Symmetric positive definite geometry of %dx%d matrices', n, n);

    M.dim = @() n*(n+1)/2;

    % Helpers to avoid computing full matrices simply to extract their trace.
    vec  = @(A) A(:);
    trAB = @(A, B) vec(A')'*vec(B);  % = trace(A*B)
    trAA = @(A) sqrt(trAB(A, A));    % = sqrt(trace(A^2))

    % The choice of metric is motivated by the symmetry present in the
    % space. The metric on the positive definite cone is its natural
    % bi-invariant metric.
    % The result is equal to: trace( (X\eta) * (X\zeta) )
    M.inner = @(X, eta, zeta) trAB(X\eta, X\zeta);

    % Notice that X\eta is *not* symmetric in general.
    % The result is equal to: sqrt(trace((X\eta)^2)).
    % Thus, we compute the sum of the squared eigenvalues of X\eta, which
    % in general is different from the sum of its squared singular values:
    % that is why this is not equivalent to computing a Frobenius norm here.
    % There should be no need to take the real part, but rounding errors
    % may cause a small imaginary part to appear, so we discard it.
    M.norm = @(X, eta) real(trAA(X\eta));

    % Same here: X\Y is not symmetric in general.
    % Same remark about taking the real part.
    M.dist = @(X, Y) real(trAA(real(logm(X\Y))));

    M.typicaldist = @() sqrt(n*(n+1)/2);

    M.egrad2rgrad = @egrad2rgrad;
    function eta = egrad2rgrad(X, eta)
        eta = X*symm(eta)*X;
    end

    M.ehess2rhess = @ehess2rhess;
    function Hess = ehess2rhess(X, egrad, ehess, eta)
        % Directional derivatives of the Riemannian gradient.
        Hess = X*symm(ehess)*X + 2*symm(eta*symm(egrad)*X);

        % Correction factor for the non-constant metric.
        Hess = Hess - symm(eta*symm(egrad)*X);
    end

    M.proj = @(X, eta) symm(eta);

    M.tangent = M.proj;
    M.tangent2ambient = @(X, eta) eta;

    M.retr = @retraction;
    function Y = retraction(X, eta, t)
        if nargin < 3
            teta = eta;
        else
            teta = t*eta;
        end
        % The symm() call is mathematically unnecessary but numerically
        % necessary.
        Y = symm(X + teta + .5*teta*(X\teta));
    end

    M.exp = @exponential;
    function Y = exponential(X, eta, t)
        if nargin < 3
            t = 1.0;
        end
        % The symm() and real() calls are mathematically unnecessary but
        % numerically necessary.
        Y = symm(X*real(expm(X\(t*eta))));
    end

    M.log = @logarithm;
    function H = logarithm(X, Y)
        % Same remark regarding the calls to symm() and real().
        H = symm(X*real(logm(X\Y)));
    end

    M.hash = @(X) ['z' hashmd5(X(:))];

    % Generate a random symmetric positive definite matrix following a
    % certain distribution. The particular choice of distribution is of
    % course arbitrary; specific applications might require different ones.
    M.rand = @random;
    function X = random()
        D = diag(1 + rand(n, 1));
        [Q, R] = qr(randn(n)); %#ok
        X = Q*D*Q';
    end

    % Generate a uniformly random unit-norm tangent vector at X.
    M.randvec = @randomvec;
    function eta = randomvec(X)
        eta = symm(randn(n));
        nrm = M.norm(X, eta);
        eta = eta / nrm;
    end

    M.lincomb = @matrixlincomb;

    M.zerovec = @(X) zeros(n);

    % Poor man's vector transport: exploit the fact that all tangent
    % spaces are the set of symmetric matrices, so that the identity is a
    % sort of vector transport. It may perform poorly if the origin and
    % target (X1 and X2) are far apart, though. This should not be the
    % case for typical optimization algorithms, which take small steps.
    M.transp = @(X1, X2, eta) eta;

    % For reference, a proper vector transport is given here, following
    % work by Sra and Hosseini: "Conic geometric optimisation on the
    % manifold of positive definite matrices", SIAM J. Optim., 2015;
    % also available at http://arxiv.org/abs/1312.1039.
    % It is not used by default. To force the use of this transport,
    % execute "M.transp = M.paralleltransp;" on the M returned by the
    % present factory.
    M.paralleltransp = @parallel_transport;
    function zeta = parallel_transport(X, Y, eta)
        E = sqrtm(Y/X);
        zeta = E*eta*E';
    end

    % vec and mat are not isometries, because of the unusual inner metric.
    M.vec = @(X, U) U(:);
    M.mat = @(X, u) reshape(u, n, n);
    M.vecmatareisometries = @() false;

end
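% Example (sketch, kept as comments to leave the file valid MATLAB): as
% the comments above describe, the exact parallel transport of Sra and
% Hosseini can replace the default identity transport by overwriting
% M.transp on the returned structure; the choice of n = 4 below is
% arbitrary:
%
%   M = sympositivedefinitefactory(4);
%   M.transp = M.paralleltransp;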