
# robust_pca

## PURPOSE

Computes a robust version of PCA (principal component analysis) on data.

## SYNOPSIS

function [U, cost] = robust_pca(X, d)

## DESCRIPTION

```
Computes a robust version of PCA (principal component analysis) on data.

function [U, cost] = robust_pca(X, d)

Given a matrix X of size p by n, such that each column represents a
point in R^p, this computes U: an orthonormal basis of size p by d such
that the column space of U captures the points X as well as possible.
More precisely, the function attempts to compute U as the minimizer
over the Grassmann manifold (the set of linear subspaces) of:

f(U) = (1/n) Sum_{i = 1:n} dist(X(:, i), the space spanned by U)
     = (1/n) Sum_{i = 1:n} || U*U'*X(:, i) - X(:, i) ||

The output cost represents the average distance achieved with the
returned U. Notice that norms are not squared, for robustness.

In practice, because this function is nonsmooth, it is smoothed with a
pseudo-Huber loss function of parameter epsilon (noted e for short), and
the smoothing parameter is iteratively reduced (with warm starts):

f_e(U) = (1/n) Sum_{i = 1:n} l_e(|| U*U'*X(:, i) - X(:, i) ||)

with l_e(x) = sqrt(x^2 + e^2) - e (for e = 0, this is the absolute value).

The intermediate optimization of the smooth cost over the Grassmann
manifold is performed using the Manopt toolbox.

Ideally, the non-outlier data should be centered. If not, this
pre-processing centers all the data, but bear in mind that outliers will
shift the center of mass too.
X = X - repmat(mean(X, 2), [1, size(X, 2)]);

There are no guarantees that this code will return the optimal U.
This code is distributed to illustrate one possible way of optimizing
a nonsmooth cost function over a manifold, using Manopt with smoothing.
For practical use, the constants in the code would need to be tuned.
```
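
To make the smoothing concrete, here is a short MATLAB sketch of the pseudo-Huber loss l_e defined above. The plot is purely illustrative and is not part of robust_pca.m; it shows how l_e approaches the absolute value as e shrinks, which is why reducing epsilon with warm starts recovers the nonsmooth cost.

```
% Pseudo-Huber loss l_e(x) = sqrt(x^2 + e^2) - e, as defined above.
l = @(x, e) sqrt(x.^2 + e^2) - e;

x = linspace(-3, 3, 301);
figure; hold on;
for e = [1, .5, .25]
    plot(x, l(x, e));      % smooth approximation for e > 0
end
plot(x, abs(x), 'k--');    % the e -> 0 limit: the absolute value
hold off;
legend('e = 1', 'e = 0.5', 'e = 0.25', '|x| (e = 0)');
```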
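
A minimal usage sketch follows; the data sizes and corruption pattern here are made up for illustration. Calling robust_pca with no arguments runs the built-in demo instead.

```
% Columns of X are points in R^p; fit a d-dimensional subspace.
p = 10; n = 200; d = 2;
X = randn(p, d) * randn(d, n);        % points close to a 2-D subspace
X(:, 1:20) = 10 * randn(p, 20);       % corrupt a few columns (outliers)
X = X - repmat(mean(X, 2), [1, n]);   % center, as advised above
[U, cost] = robust_pca(X, d);         % U is a p-by-d orthonormal basis
```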

## CROSS-REFERENCE INFORMATION

This function calls:
• grassmannfactory: Returns a manifold struct to optimize over the space of vector subspaces.
• trustregions: Riemannian trust-regions solver for optimization on manifolds.
• multiprod: Multiplies 1-D or 2-D subarrays contained in two N-D arrays.
• multiscale: Multiplies the 2-D slices in a 3-D matrix by individual scalars.
• multitransp: Transposes arrays of matrices.
This function is called by: (none)
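
The three multi* helpers operate slice-wise on 3-D arrays of stacked matrices, which is how the gradient in the source below avoids an explicit loop over data points. A small sketch with made-up sizes, matching the one-line descriptions above:

```
% Five 3-by-1 and five 2-by-1 matrices, stacked along the third dimension.
A = randn(3, 1, 5);
B = randn(2, 1, 5);
C = multiprod(A, multitransp(B));   % C(:,:,k) = A(:,:,k) * B(:,:,k)'
s = rand(1, 5);                     % one scalar per slice
D = multiscale(s, C);               % D(:,:,k) = s(k) * C(:,:,k)
E = mean(D, 3);                     % average the five 3-by-2 slices
```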

## SOURCE CODE

```
function [U, cost] = robust_pca(X, d)
% Computes a robust version of PCA (principal component analysis) on data.
%
% function [U, cost] = robust_pca(X, d)
%
% Given a matrix X of size p by n, such that each column represents a
% point in R^p, this computes U: an orthonormal basis of size p by d such
% that the column space of U captures the points X as well as possible.
% More precisely, the function attempts to compute U as the minimizer
% over the Grassmann manifold (the set of linear subspaces) of:
%
%  f(U) = (1/n) Sum_{i = 1:n} dist(X(:, i), the space spanned by U)
%       = (1/n) Sum_{i = 1:n} || U*U'*X(:, i) - X(:, i) ||
%
% The output cost represents the average distance achieved with the
% returned U. Notice that norms are not squared, for robustness.
%
% In practice, because this function is nonsmooth, it is smoothed with a
% pseudo-Huber loss function of parameter epsilon (noted e for short), and
% the smoothing parameter is iteratively reduced (with warm starts):
%
%   f_e(U) = (1/n) Sum_{i = 1:n} l_e(|| U*U'*X(:, i) - X(:, i) ||)
%
%   with l_e(x) = sqrt(x^2 + e^2) - e (for e = 0, this is the absolute value).
%
% The intermediate optimization of the smooth cost over the Grassmann
% manifold is performed using the Manopt toolbox.
%
% Ideally, the non-outlier data should be centered. If not, this
% pre-processing centers all the data, but bear in mind that outliers will
% shift the center of mass too.
% X = X - repmat(mean(X, 2), [1, size(X, 2)]);
%
% There are no guarantees that this code will return the optimal U.
% This code is distributed to illustrate one possible way of optimizing
% a nonsmooth cost function over a manifold, using Manopt with smoothing.
% For practical use, the constants in the code would need to be tuned.

% This file is part of Manopt and is copyrighted. See the license file.
%
% Main authors: Nicolas Boumal and Teng Zhang, May 2, 2014
% Contributors:
%
% Change log:
%
%   March 4, 2015 (NB):
%       Uses a pseudo-Huber loss rather than a Huber loss: this has the
%       nice advantage of being smooth and simpler to code (no if's).
%
%   April 8, 2015 (NB):
%       Built-in test data for quick tests; added comment about centering.


    % If no inputs are given, generate random data for illustration purposes.
    if nargin == 0
        % Generate some data points aligned on a subspace
        X = rand(2, 1)*(1:30) + .05*randn(2, 30).*[(1:30); (1:30)];
        % And add some random outliers to the mix
        P = randperm(size(X, 2));
        outliers = 10;
        X(:, P(1:outliers)) = 30*randn(2, outliers);
        % Center the data
        % X = X - repmat(mean(X, 2), [1, size(X, 2)]);
        d = 1;
    end

    % Prepare a Manopt problem structure for optimization of the given
    % cost (defined below) over the Grassmann manifold.
    [p, n] = size(X);
    manifold = grassmannfactory(p, d);
    problem.M = manifold;
    problem.cost = @robustpca_cost;
    problem.egrad = @robustpca_egrad;

    % Do classical PCA for the initial guess.
    % This is just one idea: it is not necessarily useful or ideal.
    % Using a random initial guess, and starting over for a few different
    % ones is probably much better. For this example, we keep it simple.
    [U, ~, ~] = svds(X, d);

    % Iteratively reduce the smoothing constant epsilon and optimize
    % the cost function over Grassmann.
    epsilon = 1;
    n_iterations = 6;
    reduction = .5;
    options.verbosity = 2; % Change this number for more or less output
    warning('off', 'manopt:getHessian:approx');
    for iter = 1 : n_iterations
        U = trustregions(problem, U, options);
        epsilon = epsilon * reduction;
    end
    warning('on', 'manopt:getHessian:approx');

    % Return the cost as the actual sum of distances, not smoothed.
    epsilon = 0;
    cost = robustpca_cost(U);

    % If working with the auto-generated input, plot the results.
    if nargin == 0
        scatter(X(1, :), X(2, :));
        hold on;
        plot(U(1)*[-1, 1]*100, U(2)*[-1, 1]*100, 'r');
        hold off;
        % Compare to a standard PCA
        [Upca, ~, ~] = svds(X, 1);
        hold on;
        plot(Upca(1)*[-1, 1]*100, Upca(2)*[-1, 1]*100, 'k');
        hold off;
        xlim(1.1*[min(X(1, :)), max(X(1, :))]);
        ylim(1.1*[min(X(2, :)), max(X(2, :))]);
        legend('data points', 'Robust PCA fit', 'Standard PCA fit');
    end

    % Smoothed cost
    function value = robustpca_cost(U)

        vecs = U*(U'*X) - X;
        sqnrms = sum(vecs.^2, 1);
        vals = sqrt(sqnrms + epsilon^2) - epsilon;
        value = mean(vals);

    end

    % Euclidean gradient of the smoothed cost (it will be transformed into
    % the Riemannian gradient automatically by Manopt).
    function G = robustpca_egrad(U)

        % Note that the computation of vecs and sqnrms is redundant
        % with their computation in the cost function. To speed
        % up the code, it would be wise to use the caching capabilities
        % of Manopt (the store structure). See the online documentation.
        % It is not done here to keep the code a bit simpler.
        UtX = U'*X;
        vecs = U*UtX - X;
        sqnrms = sum(vecs.^2, 1);
        % This explicit loop is a bit slow; the vectorized code below it
        % computes the same gradient faster.
        % G = zeros(p, d);
        % for i = 1 : n
        %     G = G + (1/sqrt(sqnrms(i) + epsilon^2)) * vecs(:, i) * UtX(:, i)';
        % end
        % G = G/n;
        G = mean(multiscale(1./sqrt(sqnrms + epsilon^2), ...
                            multiprod(reshape(vecs, [p, 1, n]), ...
                                      multitransp(reshape(UtX, [d, 1, n])))), 3);
    end

end
```
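
The comment inside robustpca_egrad notes that vecs and sqnrms are computed twice per iterate, once in the cost and once in the gradient. Below is a minimal sketch of how Manopt's store caching could remove that redundancy. It assumes these nested functions replace robustpca_cost and robustpca_egrad inside robust_pca (so X, epsilon, p, n and d come from the enclosing scope); the helper name prepare and the store field names are illustrative choices, not part of this file.

```
% Sketch only: store-based caching, assuming the enclosing robust_pca scope.
problem.cost  = @robustpca_cost_cached;
problem.egrad = @robustpca_egrad_cached;

function [value, store] = robustpca_cost_cached(U, store)
    store = prepare(U, store);
    value = mean(sqrt(store.sqnrms + epsilon^2) - epsilon);
end

function [G, store] = robustpca_egrad_cached(U, store)
    store = prepare(U, store);
    G = mean(multiscale(1./sqrt(store.sqnrms + epsilon^2), ...
                        multiprod(reshape(store.vecs, [p, 1, n]), ...
                                  multitransp(reshape(store.UtX, [d, 1, n])))), 3);
end

% Hypothetical helper: fills the store once per point U. Manopt hands the
% same store back for repeated calls at the same point, so the work is shared.
function store = prepare(U, store)
    if ~isfield(store, 'sqnrms')
        store.UtX    = U'*X;
        store.vecs   = U*store.UtX - X;
        store.sqnrms = sum(store.vecs.^2, 1);
    end
end
```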
