maxcut

PURPOSE

Algorithm to (try to) compute a maximum cut of a graph, via an SDP approach.

SYNOPSIS

function [x, cutvalue, cutvalue_upperbound, Y] = maxcut(L, r)

DESCRIPTION

 Algorithm to (try to) compute a maximum cut of a graph, via an SDP approach.
 
 function x = maxcut(L)
 function [x, cutvalue, cutvalue_upperbound, Y] = maxcut(L, r)

 L is the Laplacian matrix describing the graph to cut. The Laplacian of a
 graph is L = D - A, where D is the diagonal degree matrix (D(i, i) is the
 sum of the weights of the edges adjacent to node i) and A is the
 symmetric adjacency matrix of the graph (A(i, j) = A(j, i) is the weight
 of the edge joining nodes i and j). If L is sparse, this will be
 exploited.
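
 For instance, a minimal sketch of building such an L for a small weighted
 graph (the weights below are purely illustrative):

     A = [0 1 2 ;
          1 0 0 ;
          2 0 0];               % symmetric weighted adjacency matrix
     D = diag(sum(A, 2));       % diagonal degree matrix
     L = D - A;                 % graph Laplacian, ready to pass to maxcut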

 If the graph has n nodes, then L is nxn and the output x is a vector of
 length n such that x(i) is +1 or -1. This partitions the nodes of the
 graph into two classes, in an attempt to maximize the sum of the weights
 of the edges that go from one class to the other (MAX CUT problem).

 cutvalue is the sum of the weights of the edges 'cut' by the partition x.
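
 Indeed, for a vector x with entries +1 or -1, the cut weight can be read
 off the Laplacian directly (this identity is used in the source below):

     cutvalue = (x'*L*x)/4;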

 If the algorithm reaches the global optimum of the underlying SDP
 problem, then it produces an upper bound on the maximum cut value. This
 value is returned in cutvalue_upperbound. Otherwise, that output is set
 to NaN.
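
 Since the SDP optimal value upper bounds the maximum cut, this output can
 be used to certify the quality of the returned cut; one possible pattern
 (illustrative):

     [x, cutvalue, ub] = maxcut(L);
     if ~isnan(ub)
         % cutvalue/ub lower bounds the fraction of optimality attained.
         fprintf('Cut achieves at least %.1f%% of the optimum.\n', ...
                 100*cutvalue/ub);
     end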

 If r is specified (by default, r = n), the algorithm will stop at rank r.
 This may prevent the algorithm from reaching a globally optimal solution
 for the underlying SDP problem (but can greatly help in keeping the
 execution time under control). If a global optimum of the SDP is reached
 at a rank lower than r, the algorithm stops there.
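
 For example, to cap the relaxation rank at 5:

     x = maxcut(L, 5);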

 Y is a matrix of size nxp, with p <= r, such that X = Y*Y' is the best
 solution found for the underlying SDP problem. If cutvalue_upperbound is
 not NaN, then Y*Y' is optimal for the SDP and cutvalue_upperbound is its
 cut value.
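
 Since Y lies on the elliptope, X = Y*Y' is positive semidefinite with unit
 diagonal; this can be checked numerically, for instance:

     X = Y*Y';
     norm(diag(X) - 1)     % close to zero
     min(eig(X))           % nonnegative, up to numerical error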
 
 By Goemans and Williamson 1995, it is known that if the optimal value of
 the SDP is reached, then a randomly rounded cut is, in expectation, at
 least a fraction 0.878 of the optimal cut. (This guarantee does not apply
 exactly here because we do not use random rounding;
 sign(Y*randn(size(Y, 2), 1)) would give a cut that respects this
 statement -- in practice, it is usually worse, though.)
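
 As a sketch, the random rounding and the deterministic rounding used in
 this file read as follows (both appear in the source code below):

     x_random = sign(Y*randn(size(Y, 2), 1));  % Goemans-Williamson rounding
     [U, ~, ~] = svd(Y, 0);
     x_pca = sign(U(:, 1));                    % cut along the leading
                                               % direction of Y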

 The algorithm is essentially that of:
 Journee, Bach, Absil and Sepulchre, SIAM Journal on Optimization, 2010,
 Low-rank optimization on the cone of positive semidefinite matrices.

 It is itself based on the famous SDP relaxation of MAX CUT:
 Goemans and Williamson, 1995,
 Improved approximation algorithms for maximum cut and satisfiability
 problems using semidefinite programming.
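
 A quick way to try this function is to call it without arguments: a small
 random graph Laplacian is then generated internally (see the source code
 below):

     [x, cutvalue, cutvalue_upperbound] = maxcut();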
 
 See also: elliptope_SDP

CROSS-REFERENCE INFORMATION

This function calls: elliptopefactory, statsfunhelper, trustregions
This function is called by: (none)

SUBFUNCTIONS

function [Y, info] = maxcut_fixedrank(L, Y)

SOURCE CODE

function [x, cutvalue, cutvalue_upperbound, Y] = maxcut(L, r)
% Algorithm to (try to) compute a maximum cut of a graph, via an SDP approach.
%
% function x = maxcut(L)
% function [x, cutvalue, cutvalue_upperbound, Y] = maxcut(L, r)
%
% L is the Laplacian matrix describing the graph to cut. The Laplacian of a
% graph is L = D - A, where D is the diagonal degree matrix (D(i, i) is the
% sum of the weights of the edges adjacent to node i) and A is the
% symmetric adjacency matrix of the graph (A(i, j) = A(j, i) is the weight
% of the edge joining nodes i and j). If L is sparse, this will be
% exploited.
%
% If the graph has n nodes, then L is nxn and the output x is a vector of
% length n such that x(i) is +1 or -1. This partitions the nodes of the
% graph into two classes, in an attempt to maximize the sum of the weights
% of the edges that go from one class to the other (MAX CUT problem).
%
% cutvalue is the sum of the weights of the edges 'cut' by the partition x.
%
% If the algorithm reaches the global optimum of the underlying SDP
% problem, then it produces an upper bound on the maximum cut value. This
% value is returned in cutvalue_upperbound. Otherwise, that output is set
% to NaN.
%
% If r is specified (by default, r = n), the algorithm will stop at rank r.
% This may prevent the algorithm from reaching a globally optimal solution
% for the underlying SDP problem (but can greatly help in keeping the
% execution time under control). If a global optimum of the SDP is reached
% at a rank lower than r, the algorithm stops there.
%
% Y is a matrix of size nxp, with p <= r, such that X = Y*Y' is the best
% solution found for the underlying SDP problem. If cutvalue_upperbound is
% not NaN, then Y*Y' is optimal for the SDP and cutvalue_upperbound is its
% cut value.
%
% By Goemans and Williamson 1995, it is known that if the optimal value of
% the SDP is reached, then a randomly rounded cut is, in expectation, at
% least a fraction 0.878 of the optimal cut. (This guarantee does not apply
% exactly here because we do not use random rounding;
% sign(Y*randn(size(Y, 2), 1)) would give a cut that respects this
% statement -- in practice, it is usually worse, though.)
%
% The algorithm is essentially that of:
% Journee, Bach, Absil and Sepulchre, SIAM Journal on Optimization, 2010,
% Low-rank optimization on the cone of positive semidefinite matrices.
%
% It is itself based on the famous SDP relaxation of MAX CUT:
% Goemans and Williamson, 1995,
% Improved approximation algorithms for maximum cut and satisfiability
% problems using semidefinite programming.
%
% See also: elliptope_SDP

% This file is part of Manopt: www.manopt.org.
% Original author: Nicolas Boumal, July 18, 2013
% Contributors:
% Change log:
%
%   April 3, 2015 (NB):
%       L products now counted with the new shared memory system. This is
%       more reliable and more flexible than using a global variable.


    % If no inputs are provided, generate a random graph Laplacian.
    % This is for illustration purposes only.
    if ~exist('L', 'var') || isempty(L)
        n = 20;
        % Each edge is present independently, with probability 0.4.
        A = triu(rand(n) <= .4, 1);
        A = A+A';
        D = diag(sum(A, 2));
        L = D-A;
    end


    n = size(L, 1);
    assert(size(L, 2) == n, 'L must be square.');

    if ~exist('r', 'var') || isempty(r) || r > n
        r = n;
    end
    
    % We will let the rank increase. Each rank value will generate a cut.
    % We have to go up in the rank to eventually find a certificate of SDP
    % optimality. This in turn will provide an upper bound on the MAX CUT
    % value and ensure that we're doing well, according to Goemans and
    % Williamson's argument. In practice though, the good cuts often come
    % up for low rank values, so we keep track of the best one.
    best_x = ones(n, 1);
    best_cutvalue = 0;
    cutvalue_upperbound = NaN;
    
    time = [];
    cost = [];
    
    for rr = 2 : r
        
        manifold = elliptopefactory(n, rr);
        
        if rr == 2
            
            % At first, for rank 2, generate a random point.
            Y0 = manifold.rand();
            
        else
            
            % To increase the rank, we could just add a column of zeros to
            % the Y matrix. Unfortunately, this lands us in a saddle point.
            % To escape from the saddle, we may compute an eigenvector of
            % Sy associated with a negative eigenvalue: that will yield a
            % (second order) descent direction Z. See Journee et al.; Sy is
            % linked to dual certificates for the SDP.
            Y0 = [Y zeros(n, 1)];
            LY0 = L*Y0;
            Dy = spdiags(sum(LY0.*Y0, 2), 0, n, n);
            Sy = (Dy - L)/4;
            % Find the smallest (the "most negative") eigenvalue of Sy.
            eigsopts.isreal = true;
            [v, s] = eigs(Sy, 1, 'SA', eigsopts);
            % If there is no negative eigenvalue for Sy, then we are not at
            % a saddle point: we're actually done!
            if s >= -1e-8
                % We can stop here: we found the global optimum of the SDP,
                % and hence the reached cost is a valid upper bound on the
                % maximum cut value.
                cutvalue_upperbound = max(-[info.cost]);
                break;
            end
            
            % This is our escape direction.
            Z = manifold.proj(Y0, [zeros(n, rr-1) v]);
            
            % % These instructions can be uncommented to see what the cost
            % % function looks like at a saddle point. They require the
            % % problem structure, which is not defined here: see the
            % % helper function.
            % plotprofile(problem, Y0, Z, linspace(-1, 1, 101));
            % drawnow; pause;
            
            % Now make a step in the Z direction to escape from the saddle.
            % It is not obvious that a unit step is ok; perhaps one needs
            % to be cautious with the stepsize here. It is not too critical
            % though: the important point is to leave the saddle point. It
            % would be nice to guarantee a monotone decrease of the cost,
            % but we cannot do that with a constant step (at least, not
            % without a proper argument to back it up).
            stepsize = 1;
            Y0 = manifold.retr(Y0, Z, stepsize);
            
        end
        
        % Use the Riemannian optimization based algorithm lower in this
        % file to reach a critical point (typically a local optimizer) of
        % the max cut cost with fixed rank, starting from Y0.
        [Y, info] = maxcut_fixedrank(L, Y0);
        
        % Some info logging.
        thistime = [info.time];
        if ~isempty(time)
            thistime = time(end) + thistime;
        end
        time = [time thistime]; %#ok<AGROW>
        cost = [cost [info.cost]]; %#ok<AGROW>

        % Time to turn the matrix Y into a cut.
        % We can either do the random rounding as follows:
        % x = sign(Y*randn(rr, 1));
        % or extract the "PCA direction" of the points in Y and cut
        % orthogonally to that direction, as follows (seems faster than
        % calling svds):
        [U, ~, ~] = svd(Y, 0);
        u = U(:, 1);
        x = sign(u);

        cutvalue = (x'*L*x)/4;
        if cutvalue > best_cutvalue
            best_x = x;
            best_cutvalue = cutvalue;
        end
        
    end
    
    x = best_x;
    cutvalue = best_cutvalue;
    
    plot(time, -cost, '.-');
    xlabel('Time [s]');
    ylabel('Relaxed cut value');
    title('The relaxed cut value is an upper bound on the optimal cut value.');

end


function [Y, info] = maxcut_fixedrank(L, Y)
% Try to solve the (fixed) rank r relaxed max cut program, based on the
% Laplacian of the graph L and an initial guess Y. L is nxn and Y is nxr.

    [n, r] = size(Y);
    assert(all(size(L) == n));
    
    % The fixed rank elliptope geometry describes symmetric, positive
    % semidefinite matrices of size n with rank r whose diagonal entries
    % all equal 1.
    manifold = elliptopefactory(n, r);
    
    % % If you want to compare the performance of the elliptope geometry
    % % against the (conceptually simpler) oblique manifold geometry,
    % % uncomment this line.
    % manifold = obliquefactory(r, n, true);
    
    problem.M = manifold;
    
    % % For rapid prototyping, these lines suffice to describe the cost
    % % function and its gradient and Hessian (here expressed using the
    % % Euclidean gradient and Hessian).
    % problem.cost  = @(Y)  -trace(Y'*L*Y)/4;
    % problem.egrad = @(Y) -(L*Y)/2;
    % problem.ehess = @(Y, U) -(L*U)/2;
    
    % Instead of the prototyping version, the functions below describe the
    % cost, gradient and Hessian using the caching system (the store
    % structure). This allows us to execute exactly the required number of
    % multiplications with the matrix L. These multiplications are counted
    % using the shared memory in the store structure: that memory is
    % shared, so we access the same data regardless of the point Y
    % currently visited.

    % For every visited point Y, we will need L*Y. This function makes sure
    % the quantity L*Y is available, but only computes it if it wasn't
    % already computed.
    function store = prepare(Y, store)
        if ~isfield(store, 'LY')
            % Compute and store the product for the current point Y.
            store.LY = L*Y;
            % Create / increment the shared counter (independent of Y).
            if isfield(store.shared, 'counter')
                store.shared.counter = store.shared.counter + 1;
            else
                store.shared.counter = 1;
            end
        end
    end

    problem.cost = @cost;
    function [f, store] = cost(Y, store)
        store = prepare(Y, store);
        LY = store.LY;
        f = -(Y(:)'*LY(:))/4; % = -trace(Y'*LY)/4; but faster
    end

    problem.egrad = @egrad;
    function [g, store] = egrad(Y, store)
        store = prepare(Y, store);
        LY = store.LY;
        g = -LY/2;
    end

    problem.ehess = @ehess;
    function [h, store] = ehess(Y, U, store)
        store = prepare(Y, store); % this line is not strictly necessary
        LU = L*U;
        store.shared.counter = store.shared.counter + 1;
        h = -LU/2;
    end

    % statsfun is called exactly once after each iteration (including after
    % the evaluation of the cost at the initial guess). We then register
    % the value of the L-products counter (which counts how many products
    % were needed so far).
    % options.statsfun = @statsfun;
    % function stats = statsfun(problem, Y, stats, store) %#ok
    %     stats.Lproducts = store.shared.counter;
    % end
    % Equivalent, but simpler syntax:
    options.statsfun = statsfunhelper('Lproducts', ...
                     @(problem, Y, stats, store) store.shared.counter );
    

    % % Diagnostics tools: to make sure the gradient and Hessian are
    % % correct during the prototyping stage.
    % checkgradient(problem); pause;
    % checkhessian(problem); pause;
    
    % % To investigate the effect of the rotational invariance when using
    % % the oblique or the elliptope geometry, or to study the saddle point
    % % issue mentioned above, it is sometimes interesting to look at the
    % % spectrum of the Hessian. For large dimensions, this is slow!
    % stairs(sort(hessianspectrum(problem, Y)));
    % drawnow; pause;
    
    
    % % When facing a saddle point issue as described in the master
    % % function, and when no sure mechanism exists to find an escape
    % % direction, it may be helpful to set useRand to true and to raise
    % % miniter to more than 1 when using trustregions. This tells the
    % % solver not to stop before at least miniter iterations were
    % % accomplished (thus disregarding the zero gradient at the saddle
    % % point) and to use random search directions to kick-start the inner
    % % solve (tCG) step. It is not as efficient as finding a sure escape
    % % direction, but sometimes it's the best we have.
    % options.useRand = true;
    % options.miniter = 5;
    
    options.verbosity = 2;
    [Y, Ycost, info] = trustregions(problem, Y, options); %#ok<ASGLU>
    
    fprintf('Products with L: %d\n', max([info.Lproducts]));

end
