function X = positive_definite_intrinsic_mean(A)
% Computes an intrinsic mean of a collection of positive definite matrices.
%
% function X = positive_definite_intrinsic_mean(A)
%
% Input: A 3D matrix A of size nxnxm such that each slice A(:, :, k) is a
%        positive definite matrix of size nxn.
%
% Output: A positive definite matrix X of size nxn which is an intrinsic
%         mean of the m matrices in A, that is, X minimizes the (scaled)
%         sum of squared Riemannian distances to the matrices in A:
%
%             f(X) = 1/(2m) * sum_{k=1}^m dist(X, A(:, :, k))^2
%
%         The distance is the one induced by the natural metric on the
%         set of positive definite matrices: see sympositivedefinitefactory.
%
% This simple example is not the best way to compute intrinsic means. Its
% purpose is to serve as base code to explore other algorithms. In
% particular, in the presence of large noise, this algorithm seems unable
% to reach points with a very small gradient norm. This may be caused by
% insufficient accuracy in the gradient computation.
%
% See also: sympositivedefinitefactory

% This file is part of Manopt and is copyrighted. See the license file.
%
% Main author: Nicolas Boumal, Sept. 3, 2013
% Contributors:
%
% Change log:
%    Sep. 15, 2022 (NB):
%        Changed name from positive_definite_karcher_mean, clarified
%        some comments.

    % Generate some random data to test the function if none is given:
    % m mild perturbations of a common reference matrix, with eigenvalues
    % clipped from below to keep each slice positive definite.
    if ~exist('A', 'var') || isempty(A)
        n = 5;
        m = 50;
        A = zeros(n, n, m);
        ref = diag(max(.1, 1 + .1*randn(n, 1)));
        for i = 1 : m
            noise = 0.01*randn(n);
            noise = (noise + noise')/2;
            [V, D] = eig(ref + noise);
            A(:, :, i) = V*diag(max(.01, diag(D)))*V';
        end
    end

    % Retrieve the size of the problem:
    % there are m matrices of size nxn to average.
    n = size(A, 1);
    m = size(A, 3);
    assert(n == size(A, 2), ...
           ['The slices of A must be square, i.e., the ' ...
            'first and second dimensions of A must be equal.']);

    % Our search space is the set of positive definite matrices of size n.
    % Notice that this is the only place where we specify on which
    % manifold we wish to compute intrinsic means. Replacing this factory
    % with another geometry yields code to compute intrinsic means on that
    % other manifold, provided that manifold is equipped with a distance
    % function dist and a logarithmic map log.
    M = sympositivedefinitefactory(n);

    % Define a problem structure, specifying the manifold M, the cost
    % function and its gradient.
    problem.M = M;
    problem.cost = @cost;
    problem.grad = @grad;

    % Explicitly pick an approximate Hessian for the trust-region method.
    % (This is only to show an example of how it can be done; the solver
    % called below, rlbfgs, does not use the approximate Hessian.
    % trustregions would use it, and if this line is omitted,
    % trustregions sets up a finite-difference approximation
    % automatically, with a default step size.)
    problem.approxhess = approxhessianFD(problem, struct('stepsize', 1e-4));

    % The functions below make many redundant computations: the cost and
    % the gradient each recompute quantities tied to the same point X.
    % This performance hit can be alleviated by using the caching system.
    % We go for a simple implementation here, as a tutorial example.
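    % As a sketch of that caching system (the cached variants below are
    % an illustration under the names cost_cached, grad_cached, and are
    % not wired into the problem structure above): when a cost or
    % gradient function accepts and returns a 'store' structure, Manopt
    % passes the same store to all calls at a given point X, so the
    % logarithmic maps can be computed once and reused. Since
    % dist(X, A_k) = M.norm(X, M.log(X, A_k)), both the cost and the
    % gradient can be expressed with the cached logs. To try them,
    % replace the assignments above with problem.cost = @cost_cached;
    % and problem.grad = @grad_cached;
    function store = prepare(X, store)
        % Compute the logarithmic maps at X once and cache them.
        if ~isfield(store, 'logs')
            store.logs = zeros(n, n, m);
            for k = 1 : m
                store.logs(:, :, k) = M.log(X, A(:, :, k));
            end
        end
    end
    function [f, store] = cost_cached(X, store)
        store = prepare(X, store);
        f = 0;
        for k = 1 : m
            % dist(X, A_k) equals the norm at X of log_X(A_k).
            f = f + M.norm(X, store.logs(:, :, k))^2;
        end
        f = f/(2*m);
    end
    function [g, store] = grad_cached(X, store)
        store = prepare(X, store);
        g = M.zerovec(X);
        for k = 1 : m
            g = M.lincomb(X, 1, g, -1/m, store.logs(:, :, k));
        end
    end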
    % Cost function: f(X) = 1/(2m) * sum_k dist(X, A(:, :, k))^2.
    function f = cost(X)
        f = 0;
        for k = 1 : m
            f = f + M.dist(X, A(:, :, k))^2;
        end
        f = f/(2*m);
    end

    % Riemannian gradient of the cost function. The gradient of the map
    % X -> .5*dist(X, A(:, :, k))^2 at X is -log_X(A(:, :, k)), so the
    % gradient of f is minus the average of the logarithmic maps.
    function g = grad(X)
        g = M.zerovec(X);
        for k = 1 : m
            % Update g with a linear combination of the form
            %     g = g - [something]/m.
            g = M.lincomb(X, 1, g, -1/m, M.log(X, A(:, :, k)));
        end
    end

    % Uncomment the lines below to run some checks on the derivatives for
    % early debugging. The slopes should agree on part of the plot at
    % least. In this case, it is sometimes necessary to inspect the plot
    % visually to make the call, but the gradient is indeed correct.
    % checkgradient(problem);
    % pause;

    % Uncomment this if you want to force using a proper parallel vector
    % transport. This is not necessary: if you omit it, the default
    % vector transport is the identity map, which is (of course) cheaper
    % and seems to perform well in practice.
    % M.transp = M.paralleltransp;

    % Issue a call to a solver. Default options are selected.
    % Our initial guess is the first data point. Most solvers work well
    % on this problem. Limited-memory BFGS is one good example:
    X = rlbfgs(problem, A(:, :, 1));

end
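% Example usage (a minimal sketch; the random data construction below is
% only an illustration of one way to build positive definite slices):
%
%     n = 4;
%     m = 10;
%     A = zeros(n, n, m);
%     for k = 1 : m
%         B = randn(n);
%         A(:, :, k) = B*B' + eye(n);   % symmetric positive definite
%     end
%     X = positive_definite_intrinsic_mean(A);
%
% Calling positive_definite_intrinsic_mean() with no input instead runs
% the built-in random test data generated at the top of this file.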