Description of packing_on_the

0001 function [X, maxdot] = packing_on_the_sphere(d, n, epsilon, X0)
0002 % Return a set of points spread out on the sphere.
0003 %
0004 % function [X, maxdot] = packing_on_the_sphere(d, n, epsilon, X0)
0005 %
0006 % Using optimization on the oblique manifold, that is, the product of
0007 % spheres, this function returns a set of n points with unit norm in R^d in
0008 % the form of a matrix X of size nxd, such that the points are spread out
0009 % on the sphere. Ideally, we would minimize the maximum inner product
0010 % between any two points X(i, :) and X(j, :), i~=j, but that is a nonsmooth
0011 % cost function. Instead, we replace the max function by a classical
0012 % log-sum-exp approximation and (attempt to) solve:
0013 %
0014 % min_{X in OB(d, n)} log( .5*sum_{i~=j} exp( xi'*xj/epsilon ) ),
0015 %
0016 % with xi = X(:, i) and epsilon is some "diffusion constant". As epsilon
0017 % goes to zero, the cost function is a sharper approximation of the max
0018 % function (under some assumptions), but the cost function becomes stiffer
0019 % and hence harder to optimize.
0020 %
0021 % The second output, maxdot, is the maximum inner product between any two
0022 % points in the returned X. This number is the one we truly are trying to
0023 % minimize.
0024 %
0025 % Notice that this cost function is invariant under rotation of X:
0026 % f(X) = f(XQ) for all orthogonal Q in O(d).
0027 % This calls for optimization over the set of symmetric positive
0028 % semidefinite matrices of size n and rank d with unit diagonal, which can
0029 % be thought of as the quotient of the oblique manifold OB(d, n) by O(d):
0030 % See elliptopefactory.
0031 %
0032 % This is known as the Thomson or, more specifically, the Tammes problem:
0033 % http://en.wikipedia.org/wiki/Tammes_problem
0034 % An interesting page by Neil Sloane collecting best known packings is
0035 % available here http://neilsloane.com/packings/
0036 
0037 % This file is part of Manopt and is copyrighted. See the license file.
0038 %
0039 % Main author: Nicolas Boumal, July 2, 2013
0040 % Contributors:
0041 %
0042 % Change log:
0043 %   Aug. 14, 2013 (NB) : Code now compatible to experiment with both the
0044 %                        obliquefactory and the elliptopefactory.
0045 %
0046 %   Jan.  7, 2014 (NB) : Added reference to Neil Sloane's page and the
0047 %                        maxdot output.
0048 %
0049 %   June 24, 2014 (NB) : Now shifting exponentials to alleviate numerical
0050 %                        trouble when epsilon is too small.
0051 %
0052 %   Aug. 31, 2021 (XJ) : Added AD to compute the gradient
0053 
0054     if ~exist('d', 'var') || isempty(d)
0055         % Dimension of the embedding space: R^d
0056         d = 3;
0057     end
0058     if ~exist('n', 'var') || isempty(n)
0059         % Number n of points to place of the sphere in R^d.
0060         % For example, n=12 yields an icosahedron:
0061         % https://en.wikipedia.org/wiki/Icosahedron
0062         % Notice though that platonic solids are not always optimal.
0063         % Try for example n = 8: you don't get a cube.
0064         n = 24;
0065     end
0066     if ~exist('epsilon', 'var') || isempty(epsilon)
0067         % This value should be as close to 0 as affordable.
0068         % If it is too close to zero, optimization first becomes much
0069         % slower, than simply doesn't work anymore becomes of floating
0070         % point overflow errors (NaN's and Inf's start to appear).
0071         % If it is too large, then log-sum-exp is a poor approximation of
0072         % the max function, and the spread will be less uniform.
0073         % An okay value seems to be 0.01 or 0.001 for example. Note that a
0074         % better strategy than using a small epsilon straightaway is to
0075         % reduce epsilon bit by bit and to warm-start subsequent
0076         % optimization in that way. Trustregions will be more appropriate
0077         % for these fine tunings.
0078         epsilon = 0.0015;
0079     end
0080     
0081     % Pick your manifold (the elliptope factory quotients out the global
0082     % rotation invariance of the problem, which is more natural but
0083     % conceptually a bit more complicated --- for usage with the toolbox it
0084     % is the same though: just uncomment the appropriate line).
0085     manifold = obliquefactory(d, n, true);
0086     % manifold = elliptopefactory(n, d);
0087     
0088     % Generate a random initial guess if none was given.
0089     if ~exist('X0', 'var') || isempty(X0)
0090         X0 = manifold.rand();
0091     end
0092 
0093     % Define the cost function with caching system used: the store
0094     % structure we receive as input is tied to the input point X. Everytime
0095     % this cost function is called at this point X, we will receive the
0096     % same store structure back. We may modify the store structure inside
0097     % the function and return it: the changes will be remembered for next
0098     % time.
0099     function [f, store] = cost(X, store)
0100         if ~isfield(store, 'ready')
0101             XXt = X*X';
0102             % Shift the exponentials by the maximum value to reduce
0103             % numerical trouble due to possible overflows.
0104             s = max(max(triu(XXt, 1)));
0105             expXXt = exp((XXt-s)/epsilon);
0106             % Zero out the diagonal
0107             expXXt(1:(n+1):end) = 0;
0108             u = sum(sum(triu(expXXt, 1)));
0109             store.XXt = XXt;
0110             store.s = s;
0111             store.expXXt = expXXt;
0112             store.u = u;
0113             store.ready = true;
0114         end
0115         u = store.u;
0116         s = store.s;
0117         f = s + epsilon*log(u);
0118     end
0119 
0120     % Define the gradient of the cost. When the gradient is called at a
0121     % point X for which the cost was already called, the store structure we
0122     % receive remember everything that the cost function stored in it, so
0123     % we can reuse previously computed elements.
0124     function [g, store] = grad(X, store)
0125         if ~isfield(store, 'ready')
0126             [~, store] = cost(X, store);
0127         end
0128         % Compute the Euclidean gradient
0129         eg = store.expXXt*X / store.u;
0130         % Convert to the Riemannian gradient (by projection)
0131         g = manifold.egrad2rgrad(X, eg);
0132     end
0133 
0134     % Setup the problem structure with its manifold M and cost+grad
0135     % functions.
0136     problem.M = manifold;
0137     problem.cost = @cost;
0138     problem.grad = @grad;
0139 
0140     % An alternative way to compute the grad is to use automatic
0141     % differentiation provided in the deep learning toolbox (slower)
0142     % Notice that the function triu is not supported for AD so far.
0143     % Replace it with ctriu described in the file manoptADhelp.m
0144     % problem.cost = @cost_AD;
0145     %    function f = cost_AD(X)
0146     %        XXt = X*X';
0147     %        s = max(max(ctriu(XXt, 1)));
0148     %        expXXt = exp((XXt-s)/epsilon);
0149     %        expXXt(1:(n+1):end) = 0;
0150     %        u = sum(sum(ctriu(expXXt, 1)));
0151     %        f = s + epsilon*log(u);
0152     %    end
0153     % Call manoptAD to prepare AD for the problem structure
0154     % problem = manoptAD(problem,'egrad');
0155     
0156     % For debugging, it's always nice to check the gradient a few times.
0157     % checkgradient(problem);
0158     % pause;
0159     
0160     % Call a solver on our problem with a few options defined. We did not
0161     % specify the Hessian but it is still okay to call trustregion: Manopt
0162     % will approximate the Hessian with finite differences of the gradient.
0163     opts.tolgradnorm = 1e-8;
0164     opts.maxtime = 1200;
0165     opts.maxiter = 1e5;
0166     % X = trustregions(problem, X0, opts);
0167     X = conjugategradient(problem, X0, opts);
0168     
0169     % Evaluate the maximum inner product between any two points of X.
0170     XXt = X*X';
0171     dots = XXt(find(triu(ones(n), 1))); %#ok<FNDSB>
0172     maxdot = max(dots);
0173     
0174     % Similarly, even though we did not specify the Hessian, we may still
0175     % estimate its spectrum at the solution. It should reflect the
0176     % invariance of the cost function under a global rotatioon of the
0177     % sphere, which is an invariance under the group O(d) of dimension
0178     % d(d-1)/2 : this translates into d(d-1)/2 zero eigenvalues in the
0179     % spectrum of the Hessian.
0180     % The approximate Hessian is not a linear operator, and is it a
0181     % fortiori not symmetric. The result of this computation is thus not
0182     % reliable. It does display the zero eigenvalues as expected though.
0183     if manifold.dim() < 300
0184         evs = real(hessianspectrum(problem, X));
0185         figure;
0186         stem(1:length(evs), sort(evs), '.');
0187         title(['Eigenvalues of the approximate Hessian of the cost ' ...
0188                'function at the solution']);
0189     end
0190     
0191     
0192     % Show how the inner products X(:, i)'*X(:, j) are distributed.
0193     figure;
0194     hist(real(acos(dots)), 20);
0195     title('Histogram of the geodesic distances');
0196     
0197     % This is the quantity we actually want to minimize.
0198     fprintf('Maximum inner product between two points: %g\n', maxdot);
0199     
0200     
0201     % Give some visualization if the dimension allows
0202     if d == 2
0203         % For the circle, the optimal solution consists in spreading the
0204         % points with angles uniformly sampled in (0, 2pi). This
0205         % corresponds to the following value for the max inner product:
0206         fprintf('Optimal value for the max inner product: %g\n', cos(2*pi/n));
0207         figure;
0208         t = linspace(-pi, pi, 201);
0209         plot(cos(t), sin(t), '-', 'LineWidth', 3, 'Color', [152,186,220]/255);
0210         daspect([1 1 1]);
0211         box off;
0212         axis off;
0213         hold on;
0214         plot(X(:, 1), X(:, 2), 'r.', 'MarkerSize', 25);
0215         hold off;
0216     end
0217     if d == 3
0218         figure;
0219         % Plot the sphere
0220         [sphere_x, sphere_y, sphere_z] = sphere(50);
0221         handle = surf(sphere_x, sphere_y, sphere_z);
0222         set(handle, 'FaceColor', [152,186,220]/255);
0223         set(handle, 'FaceAlpha', .5);
0224         set(handle, 'EdgeColor', [152,186,220]/255);
0225         set(handle, 'EdgeAlpha', .5);
0226         daspect([1 1 1]);
0227         box off;
0228         axis off;
0229         hold on;
0230         % Add the chosen points
0231         Y = 1.02*X';
0232         plot3(Y(1, :), Y(2, :), Y(3, :), 'r.', 'MarkerSize', 25);
0233         % And connect the points which are at minimal distance,
0234         % within some tolerance.
0235         min_distance = real(acos(maxdot));
0236         connected = real(acos(XXt)) <= 1.20*min_distance;
0237         [Ic, Jc] = find(triu(connected, 1));
0238         for k = 1 : length(Ic)
0239             i = Ic(k); j = Jc(k);
0240             plot3(Y(1, [i j]), Y(2, [i j]), Y(3, [i j]), 'k-');
0241         end
0242         hold off;
0243     end
0244 
0245 end
packing_on_the_sphere

PURPOSE

SYNOPSIS

DESCRIPTION

CROSS-REFERENCE INFORMATION

SUBFUNCTIONS

SOURCE CODE