Home > manopt > manifolds > hyperbolic > hyperbolicfactory.m

hyperbolicfactory

PURPOSE ^

Factory for matrices whose columns live on the hyperbolic manifold

SYNOPSIS ^

function M = hyperbolicfactory(n, m, transposed)

DESCRIPTION ^

 Factory for matrices whose columns live on the hyperbolic manifold

 function M = hyperbolicfactory(n)
 function M = hyperbolicfactory(n, m)
 function M = hyperbolicfactory(n, m, transposed)

 Returns a structure M which describes the hyperbolic manifold in Manopt.
 A point on the manifold is a matrix X of size (n+1)-by-m whose columns
 live on the hyperbolic manifold, that is, for each column x of X, we have

   -x(1)^2 + x(2)^2 + x(3)^2 + ... + x(n+1)^2 = -1.

 The positive branch is selected by M.rand(), that is, x(1) > 0, but all
 tools work on the negative branch as well.

 Equivalently, defining the Minkowski (semi) inner product

   <x, y> = -x(1)y(1) + x(2)y(2) + x(3)y(3) + ... + x(n+1)y(n+1)

 and the induced Minkowski (semi) norm ||x||^2 = <x, x>, we can write
 compactly that each column of X has squared Minkowski norm equal to -1.

 The set of matrices X that satisfy this constraint is a smooth manifold.
 Tangent vectors at X are matrices U of the same size as X. If x and u are
 the kth columns of X and U respectively, then <x, u> = 0.

 This manifold is turned into a Riemannian manifold by restricting the
 Minkowski inner product to each tangent space (a simple calculation
 confirms that this metric is indeed Riemannian and not just semi
 Riemannian, that is, it is positive definite when restricted to each
 tangent space). This is the hyperbolic manifold: for m = 1, all of its
 sectional curvatures are equal to -1. This is called the hyperboloid or
 the Lorentz geometry.

 This manifold is an embedded submanifold of Euclidean space (the set of
 matrices of size (n+1)-by-m equipped with the usual trace inner product).
 Thus, when defining the Euclidean gradient for example (problem.egrad),
 it should be specified as if the function were defined in Euclidean space
 directly. The tool M.egrad2rgrad will automatically convert that gradient
 to the correct Riemannian gradient, as needed to satisfy the metric. The
 same is true for the Euclidean Hessian and other tools that manipulate
 elements in the embedding space.

 Importantly, the resulting manifold is /not/ a Riemannian submanifold of
 Euclidean space, because its metric is not obtained simply by restricting
 the Euclidean metric to the tangent spaces. However, it is a
 semi-Riemannian submanifold of Minkowski space, that is, the set of
 matrices of size (n+1)-by-m equipped with the Minkowski inner product.
 Minkowski space itself can be seen as a (linear) semi-Riemannian manifold
 embedded in Euclidean space. This view is entirely equivalent to the one
 described above (the Riemannian structure of the resulting manifold is
 exactly the same), and it is useful to derive some of the tools this
 factory provides.

 If transposed is set to true (it is false by default), then the matrices
 are transposed: a point X on the manifold is a matrix of size m-by-(n+1)
 and each row is an element in hyperbolic space. It is the same geometry,
 just a different representation.


 Resources:

 1. Nickel and Kiela, "Learning Continuous Hierarchies in the Lorentz
    Model of Hyperbolic Geometry", ICML, 2018.

 2. Wilson and Leimeister, "Gradient descent in hyperbolic space",
    arXiv preprint arXiv:1805.08207 (2018).
 
 3. Pennec, "Hessian of the Riemannian squared distance", HAL INRIA, 2017.

 Ported primarily from the McTorch toolbox at
 https://github.com/mctorch/mctorch.

 See also: poincareballfactory spherefactory obliquefactory obliquecomplexfactory

CROSS-REFERENCE INFORMATION ^

This function calls: This function is called by:

SUBFUNCTIONS ^

SOURCE CODE ^

0001 function M = hyperbolicfactory(n, m, transposed)
0002 % Factory for matrices whose columns live on the hyperbolic manifold
0003 %
0004 % function M = hyperbolicfactory(n)
0005 % function M = hyperbolicfactory(n, m)
0006 % function M = hyperbolicfactory(n, m, transposed)
0007 %
0008 % Returns a structure M which describes the hyperbolic manifold in Manopt.
0009 % A point on the manifold is a matrix X of size (n+1)-by-m whose columns
0010 % live on the hyperbolic manifold, that is, for each column x of X, we have
0011 %
0012 %   -x(1)^2 + x(2)^2 + x(3)^2 + ... + x(n+1)^2 = -1.
0013 %
0014 % The positive branch is selected by M.rand(), that is, x(1) > 0, but all
0015 % tools work on the negative branch as well.
0016 %
0017 % Equivalently, defining the Minkowski (semi) inner product
0018 %
0019 %   <x, y> = -x(1)y(1) + x(2)y(2) + x(3)y(3) + ... + x(n+1)y(n+1)
0020 %
0021 % and the induced Minkowski (semi) norm ||x||^2 = <x, x>, we can write
0022 % compactly that each column of X has squared Minkowski norm equal to -1.
0023 %
0024 % The set of matrices X that satisfy this constraint is a smooth manifold.
0025 % Tangent vectors at X are matrices U of the same size as X. If x and u are
0026 % the kth columns of X and U respectively, then <x, u> = 0.
0027 %
0028 % This manifold is turned into a Riemannian manifold by restricting the
0029 % Minkowski inner product to each tangent space (a simple calculation
0030 % confirms that this metric is indeed Riemannian and not just semi
0031 % Riemannian, that is, it is positive definite when restricted to each
0032 % tangent space). This is the hyperbolic manifold: for m = 1, all of its
0033 % sectional curvatures are equal to -1. This is called the hyperboloid or
0034 % the Lorentz geometry.
0035 %
0036 % This manifold is an embedded submanifold of Euclidean space (the set of
0037 % matrices of size (n+1)-by-m equipped with the usual trace inner product).
0038 % Thus, when defining the Euclidean gradient for example (problem.egrad),
0039 % it should be specified as if the function were defined in Euclidean space
0040 % directly. The tool M.egrad2rgrad will automatically convert that gradient
0041 % to the correct Riemannian gradient, as needed to satisfy the metric. The
0042 % same is true for the Euclidean Hessian and other tools that manipulate
0043 % elements in the embedding space.
0044 %
0045 % Importantly, the resulting manifold is /not/ a Riemannian submanifold of
0046 % Euclidean space, because its metric is not obtained simply by restricting
0047 % the Euclidean metric to the tangent spaces. However, it is a
0048 % semi-Riemannian submanifold of Minkowski space, that is, the set of
0049 % matrices of size (n+1)-by-m equipped with the Minkowski inner product.
0050 % Minkowski space itself can be seen as a (linear) semi-Riemannian manifold
0051 % embedded in Euclidean space. This view is entirely equivalent to the one
0052 % described above (the Riemannian structure of the resulting manifold is
0053 % exactly the same), and it is useful to derive some of the tools this
0054 % factory provides.
0055 %
0056 % If transposed is set to true (it is false by default), then the matrices
0057 % are transposed: a point X on the manifold is a matrix of size m-by-(n+1)
0058 % and each row is an element in hyperbolic space. It is the same geometry,
0059 % just a different representation.
0060 %
0061 %
0062 % Resources:
0063 %
0064 % 1. Nickel and Kiela, "Learning Continuous Hierarchies in the Lorentz
0065 %    Model of Hyperbolic Geometry", ICML, 2018.
0066 %
0067 % 2. Wilson and Leimeister, "Gradient descent in hyperbolic space",
0068 %    arXiv preprint arXiv:1805.08207 (2018).
0069 %
0070 % 3. Pennec, "Hessian of the Riemannian squared distance", HAL INRIA, 2017.
0071 %
0072 % Ported primarily from the McTorch toolbox at
0073 % https://github.com/mctorch/mctorch.
0074 %
0075 % See also: poincareballfactory spherefactory obliquefactory obliquecomplexfactory
0076 
0077 
0078 % This file is part of Manopt: www.manopt.org.
0079 % Original authors: Bamdev Mishra <bamdevm@gmail.com>, Mayank Meghwanshi,
0080 % Pratik Jawanpuria, Anoop Kunchukuttan, and Hiroyuki Kasai Oct 28, 2018.
0081 % Contributors: Nicolas Boumal
0082 % Change log:
0083 %   May 14, 2020 (NB):
0084 %       Clarified comments about distance computation.
0085 %   July 13, 2020 (NB):
0086 %       Added pairmean function.
0087 
0088     % Design note: all functions that are defined here but not exposed
0089     % outside work for non-transposed representations. Only the wrappers
0090     % that eventually expose functionalities handle transposition. This
0091     % makes it easier to compose functions internally.
0092 
0093     if ~exist('m', 'var') || isempty(m)
0094         m = 1;
0095     end 
0096     
0097     if ~exist('transposed', 'var') || isempty(transposed)
0098         transposed = false;
0099     end
0100     
0101     if transposed
0102         trnsp = @(X) X';
0103         trnspstr = ', transposed';
0104     else
0105         trnsp = @(X) X;
0106         trnspstr = '';
0107     end
0108 
0109     M.name = @() sprintf('Hyperbolic manifold H(%d, %d)%s', n, m, trnspstr);
0110     
0111     M.dim = @() n*m;
0112     
0113     M.typicaldist = @() sqrt(n*m);
0114 
0115     % Returns a row vector q such that q(k) is the Minkowski inner product
0116     % of columns U(:, k) and V(:, k). This is defined in all of Minkowski
0117     % space, not only on tangent spaces. In particular, if X is a point on
0118     % the manifold, then inner_minkowski_columns(X, X) should return a
0119     % vector of all -1's.
0120     function q = inner_minkowski_columns(U, V)
0121         q = -U(1, :).*V(1, :) + sum(U(2:end, :).*V(2:end, :), 1);
0122     end
0123     
0124     % Riemannian metric: we sum over the m copies of the hyperbolic
0125     % manifold, each equipped with a restriction of the Minkowski metric.
0126     M.inner = @(X, U, V) sum(inner_minkowski_columns(trnsp(U), trnsp(V)));
0127     
0128     % Mathematically, the Riemannian metric is positive definite, hence
0129     % M.inner always returns a nonnegative number when U is tangent at X.
0130     % Numerically, because the inner product involves a difference of
0131     % positive numbers, round-off may result in a small negative number.
0132     % Taking the max against 0 avoids imaginary results.
0133     M.norm = @(X, U) sqrt(max(M.inner(X, U, U), 0));
0134     
0135     M.dist = @(X, Y) norm(dists(trnsp(X), trnsp(Y)));
0136     % This function returns a row vector of length m such that d(k) is the
0137     % geodesic distance between X(:, k) and Y(:, k).
0138     function d = dists(X, Y)
0139         % Mathematically, each column of U = X-Y has nonnegative squared
0140         % Minkowski norm. To avoid potentially imaginary results due to
0141         % round-off errors, we take the max against 0.
0142         U = X-Y;
0143         mink_sqnorms = max(0, inner_minkowski_columns(U, U));
0144         mink_norms = sqrt(mink_sqnorms);
0145         d = 2*asinh(.5*mink_norms);
0146         % The formula above is equivalent to
0147         % d = max(0, real(acosh(-inner_minkowski_columns(X, Y))));
0148         % but is numerically more accurate when distances are small.
0149         % When distances are large, it is better to use the acosh formula.
0150     end
0151     
0152     M.proj = @(X, U) trnsp(projection(trnsp(X), trnsp(U)));
0153     function PU = projection(X, U)
0154         inners = inner_minkowski_columns(X, U);
0155         PU = U + bsxfun(@times, X, inners);
0156     end
0157     
0158     M.tangent = M.proj;
0159     
0160     % For Riemannian submanifolds, converting the Euclidean gradient into
0161     % the Riemannian gradient amounts to an orthogonal projection. Here
0162     % however, the manifold is not a Riemannian submanifold of Euclidean
0163     % space, hence extra corrections are required to account for the change
0164     % of metric.
0165     M.egrad2rgrad = @(X, egrad) trnsp(egrad2rgrad(trnsp(X), trnsp(egrad)));
0166     function rgrad = egrad2rgrad(X, egrad)
0167         egrad(1, :) = -egrad(1, :);
0168         rgrad = projection(X, egrad);
0169     end
0170     
0171     M.ehess2rhess = @(X, egrad, ehess, U) ...
0172         trnsp(ehess2rhess(trnsp(X), trnsp(egrad), trnsp(ehess), trnsp(U)));
0173     function rhess = ehess2rhess(X, egrad, ehess, U)
0174         egrad(1, :) = -egrad(1, :);
0175         ehess(1, :) = -ehess(1, :);
0176         inners = inner_minkowski_columns(X, egrad);
0177         rhess = projection(X, bsxfun(@times, U, inners) + ehess);
0178     end
0179     
0180     % For the exponential, we cannot separate trnsp() nicely from the main
0181     % function because the third input, t, is optional.
0182     M.exp = @exponential;
0183     function Y = exponential(X, U, t)
0184         X = trnsp(X);
0185         U = trnsp(U);
0186         
0187         if nargin < 3
0188             tU = U;   % corresponds to t = 1
0189         else
0190             tU = t*U;
0191         end
0192         
0193         % Compute the individual Minkowski norms of the columns of U.
0194         mink_inners = inner_minkowski_columns(tU, tU);
0195         mink_norms = sqrt(max(0, mink_inners));
0196         
0197         % Coefficients for the exponential. For b, note that NaN's appear
0198         % when an element of mink_norms is zero, in which case the correct
0199         % convention is to define sinh(0)/0 = 1.
0200         a = cosh(mink_norms);
0201         b = sinh(mink_norms)./mink_norms;
0202         b(isnan(b)) = 1;
0203         
0204         Y = bsxfun(@times, X, a) + bsxfun(@times, tU, b);
0205 
0206         Y = trnsp(Y);
0207     end
0208     
0209     M.retr = M.exp;
0210     
0211     M.log = @(X, Y) trnsp(logarithm(trnsp(X), trnsp(Y)));
0212     function U = logarithm(X, Y)
0213         d = dists(X, Y);
0214         a = d./sinh(d);
0215         a(isnan(a)) = 1;
0216         U = projection(X, bsxfun(@times, Y, a));
0217     end
0218 
0219     M.hash = @(X) ['z' hashmd5(X(:))];
0220     
0221     M.rand = @() trnsp(myrand());
0222     function X = myrand()
0223         X1 = randn(n, m);
0224         x0 = sqrt(1 + sum(X1.^2, 1)); % selects positive branch
0225         X = [x0; X1];
0226     end
0227     
0228     M.normalize = @(X, U) U / M.norm(X, U);
0229     M.randvec = @(X) M.normalize(X, M.proj(X, randn(size(X))));
0230     
0231     M.lincomb = @matrixlincomb;
0232     
0233     M.zerovec = @(X) zeros(size(X));
0234     
0235     M.transp = @(X1, X2, U) M.proj(X2, U);
0236     
0237     M.pairmean = @(x1, x2) M.exp(x1, M.log(x1, x2), .5);
0238    
0239     % vec returns a vector representation of an input tangent vector which
0240     % is represented as a matrix; mat returns the original matrix
0241     % representation of the input vector representation of a tangent
0242     % vector; vec and mat are thus inverse of each other.
0243     vect = @(X) X(:);
0244     M.vec = @(X, U_mat) vect(trnsp(U_mat));
0245     M.mat = @(X, U_vec) trnsp(reshape(U_vec, [n+1, m]));
0246     M.vecmatareisometries = @() false;
0247 
0248 end

Generated on Fri 30-Sep-2022 13:18:25 by m2html © 2005