function using_gpu()
% Manopt example on how to use GPU with manifold factories that allow it.
%
% We are still working on this feature, and so far only a few factories
% have been adapted to work on GPU, but the adaptations are rather easy.
% If there is a manifold you'd like to use on GPU, let us know via the
% forum on http://www.manopt.org: we'll be happy to help!
%
% See also: spherefactory stiefelfactory grassmannfactory complexcirclefactory

% This file is part of Manopt: www.manopt.org.
% Original author: Nicolas Boumal, Aug. 3, 2018.
% Contributors:
% Change log:


    if exist('OCTAVE_VERSION', 'builtin')
        warning('manopt:usinggpu', 'Octave does not handle GPUs at this time.');
        return;
    end

    if gpuDeviceCount() <= 0
        warning('manopt:usinggpu', 'No GPU available: cannot run example.');
        return;
    end

    % Construct a large problem to illustrate the use of GPU.
    % Below, we compute the p left-most eigenvectors of A (symmetric).
    % On a particular test computer, we found that for n = 100 or 1000,
    % the CPU is faster, but for n = 10000, the GPU tends to be 10x faster.
    p = 3;
    n = 10000;
    A = randn(n);
    A = A + A';

    inner = @(U, V) U(:)'*V(:);

    % First, set up and run the optimization problem on the CPU.
    problem.M = grassmannfactory(n, p, 1);  % 1 copy of Grassmann(n, p)
    problem.cost = @(X) .5*inner(X, A*X);   % Rayleigh quotient to be minimized
    problem.egrad = @(X) A*X;               % Could use caching to save here
    problem.ehess = @(X, Xdot) A*Xdot;
    X0 = problem.M.rand();                  % Random initial guess
    tic_cpu = tic();
    X_cpu = trustregions(problem, X0);      % Run any solver
    time_cpu = toc(tic_cpu);

    % Then, move the data to the GPU, redefine the problem using the moved
    % data, activate the GPU flag in the factory, and run it again.
    A = gpuArray(A);
    problem.M = grassmannfactory(n, p, 1, true); % true is the GPU flag
    problem.cost = @(X) .5*inner(X, A*X);   % Code for cost, gradient, etc.
    problem.egrad = @(X) A*X;               % basically didn't change, but
    problem.ehess = @(X, Xdot) A*Xdot;      % operates on gpuArrays now.
    X0 = gpuArray(X0);
    tic_gpu = tic();
    X_gpu = trustregions(problem, X0);
    time_gpu = toc(tic_gpu);

    fprintf('Total time CPU: %g\nTotal time GPU: %g\nSolution difference: %g\n', ...
            time_cpu, time_gpu, norm(X_cpu - X_gpu, 'fro'));

end
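
% Note on comparing the two solutions: points on the Grassmann manifold
% are equivalence classes (subspaces), so two valid solutions can differ
% entry-wise while spanning the same subspace, and the Frobenius norm
% above can overstate the difference. A minimal sketch of a
% basis-independent comparison, assuming the built-in subspace function
% and using gather to move the result back to CPU memory:
%
%     theta = subspace(X_cpu, gather(X_gpu)); % largest principal angle
%     fprintf('Angle between computed subspaces: %g\n', theta);
%
% A small angle indicates the CPU and GPU runs found (numerically) the
% same subspace, even if the orthonormal bases representing it differ.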