
# robust_pca

## PURPOSE

Computes a robust version of PCA (principal component analysis) on data.

## SYNOPSIS

```matlab
function [U, cost] = robust_pca(X, d)
```

## DESCRIPTION

Given a matrix X of size p by n, such that each column represents a
point in R^p, this computes U: an orthonormal basis of size p by d such
that the column space of U captures the points X as well as possible.
More precisely, the function attempts to compute U as the minimizer
over the Grassmann manifold (the set of linear subspaces) of:

```
f(U) = (1/n) Sum_{i = 1:n} dist(X(:, i), the space spanned by U)
     = (1/n) Sum_{i = 1:n} || U*U'*X(:, i) - X(:, i) ||
```

The output cost is the average distance achieved with the returned U.
Notice that the norms are not squared: this is what makes the fit robust.

In practice, because this cost function is nonsmooth, it is smoothed with
a pseudo-Huber loss of parameter epsilon (written e for short), and the
smoothing parameter is iteratively reduced (with warm starts):

```
f_e(U) = (1/n) Sum_{i = 1:n} l_e(|| U*U'*X(:, i) - X(:, i) ||)

with l_e(x) = sqrt(x^2 + e^2) - e  (for e = 0, this is the absolute value).
```

The intermediate optimization of the smooth cost over the Grassmann
manifold is performed using the Manopt toolbox.

Ideally, the non-outlier data should be centered. If it is not, the
following pre-processing centers all the data, but bear in mind that
outliers will shift the center of mass too:

```matlab
X = X - repmat(mean(X, 2), [1, size(X, 2)]);
```

There is no guarantee that this code returns the optimal U. It is
distributed to illustrate one possible way of optimizing a nonsmooth cost
function over a manifold, using Manopt with smoothing. For practical use,
the constants in the code would need to be tuned.
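For readers who want to experiment with the smoothed cost outside MATLAB, here is a minimal NumPy sketch of f_e. The function name `pseudo_huber_cost` is ours for illustration, not part of Manopt.

```python
import numpy as np

def pseudo_huber_cost(U, X, eps):
    """Smoothed robust-PCA cost f_e(U) = mean_i l_e(||U U' x_i - x_i||),
    with the pseudo-Huber loss l_e(x) = sqrt(x^2 + eps^2) - eps."""
    residuals = U @ (U.T @ X) - X          # p-by-n residual vectors
    sqnrms = np.sum(residuals**2, axis=0)  # squared norm of each column
    return float(np.mean(np.sqrt(sqnrms + eps**2) - eps))

# Points lying exactly in the subspace spanned by U have zero residual,
# hence zero cost for any eps (since l_e(0) = 0).
u = np.array([[1.0], [0.0]])                  # basis of the x-axis in R^2
X_on = u @ np.arange(1.0, 6.0).reshape(1, 5)  # 5 points on that axis
print(pseudo_huber_cost(u, X_on, eps=0.1))    # essentially zero
```

As eps shrinks toward 0, f_e approaches the average (unsquared) distance f, which is the quantity reported in the output `cost`.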

## CROSS-REFERENCE INFORMATION

This function calls:
• grassmannfactory: Returns a manifold struct to optimize over the space of vector subspaces.
• trustregions: Riemannian trust-regions solver for optimization on manifolds.
• multiprod: Matrix-multiplies 2-D slices of N-D arrays.
• multiscale: Multiplies the 2-D slices in a 3-D matrix by individual scalars.
• multitransp: Transposes the matrix slices of an N-D array (no complex conjugate).
This function is called by: (none)
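The helpers multiprod, multiscale and multitransp implement slice-wise matrix operations, which the gradient code below combines to average weighted outer products. As an illustration, here is a NumPy sketch of that pattern, where a single `einsum` plays the role of all three helpers; the variable names are ours.

```python
import numpy as np

rng = np.random.default_rng(1)
p, d, n = 4, 2, 6
vecs = rng.standard_normal((p, n))  # one residual vector per data point
UtX  = rng.standard_normal((d, n))  # one coefficient vector per data point
w    = rng.standard_normal(n)       # one scalar weight per data point

# Loop version: scale each outer product vecs(:, i) * UtX(:, i)' by w(i),
# then average over the n slices.
G_loop = sum(w[i] * np.outer(vecs[:, i], UtX[:, i]) for i in range(n)) / n

# What multiscale(w, multiprod(..., multitransp(...))) computes, as one einsum.
G_vec = np.einsum('pi,i,di->pd', vecs, w, UtX) / n

print(np.allclose(G_loop, G_vec))  # True
```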

## SOURCE CODE

```matlab
function [U, cost] = robust_pca(X, d)
% Computes a robust version of PCA (principal component analysis) on data.
%
% function [U, cost] = robust_pca(X, d)
%
% Given a matrix X of size p by n, such that each column represents a
% point in R^p, this computes U: an orthonormal basis of size p by d such
% that the column space of U captures the points X as well as possible.
% More precisely, the function attempts to compute U as the minimizer
% over the Grassmann manifold (the set of linear subspaces) of:
%
%  f(U) = (1/n) Sum_{i = 1:n} dist(X(:, i), the space spanned by U)
%       = (1/n) Sum_{i = 1:n} || U*U'*X(:, i) - X(:, i) ||
%
% The output cost represents the average distance achieved with the
% returned U. Notice that norms are not squared, for robustness.
%
% In practice, because this function is nonsmooth, it is smoothed with a
% pseudo-Huber loss function of parameter epsilon (noted e for short), and
% the smoothing parameter is iteratively reduced (with warm starts):
%
%   f_e(U) = (1/n) Sum_{i = 1:n} l_e(|| U*U'*X(:, i) - X(:, i) ||)
%
%   with l_e(x) = sqrt(x^2 + e^2) - e (for e = 0, this is absolute value).
%
% The intermediate optimization of the smooth cost over the Grassmann
% manifold is performed using the Manopt toolbox.
%
% Ideally, the non-outlier data should be centered. If not, this
% pre-processing centers all the data, but bear in mind that outliers will
% shift the center of mass too.
% X = X - repmat(mean(X, 2), [1, size(X, 2)]);
%
% There are no guarantees that this code will return the optimal U.
% This code is distributed to illustrate one possible way of optimizing
% a nonsmooth cost function over a manifold, using Manopt with smoothing.
% For practical use, the constants in the code would need to be tuned.

% This file is part of Manopt and is copyrighted. See the license file.
%
% Main authors: Nicolas Boumal and Teng Zhang, May 2, 2014
% Contributors:
%
% Change log:
%
%   March 4, 2015 (NB):
%       Uses a pseudo-Huber loss rather than a Huber loss: this has the
%       nice advantage of being smooth and simpler to code (no if's).
%
%   April 8, 2015 (NB):
%       Built-in test data for quick tests; added comment about centering.
%
%   Aug  20, 2021 (XJ):
%       Added AD to compute the egrad

    % If no inputs are given, generate random data for illustration.
    if nargin == 0
        % Generate some data points aligned on a subspace
        X = rand(2, 1)*(1:30) + .05*randn(2, 30).*[(1:30); (1:30)];
        % And add some random outliers to the mix
        P = randperm(size(X, 2));
        outliers = 10;
        X(:, P(1:outliers)) = 30*randn(2, outliers);
        % Center the data
        % X = X - repmat(mean(X, 2), [1, size(X, 2)]);
        d = 1;
    end

    % Prepare a Manopt problem structure for optimization of the given
    % cost (defined below) over the Grassmann manifold.
    [p, n] = size(X);
    manifold = grassmannfactory(p, d);
    problem.M = manifold;
    problem.cost = @robustpca_cost;
    problem.egrad = @robustpca_gradient;

    % Do classical PCA for the initial guess.
    % This is just one idea: it is not necessarily useful or ideal.
    % Using a random initial guess, and starting over for a few different
    % ones is probably much better. For this example, we keep it simple.
    [U, ~, ~] = svds(X, d);

    % Iteratively reduce the smoothing constant epsilon and optimize
    % the cost function over Grassmann.
    epsilon = 1;
    n_iterations = 6;
    reduction = .5;
    options.verbosity = 2; % Change this number for more or less output
    warning('off', 'manopt:getHessian:approx');

    % An alternative way to compute the egrad is to use the automatic
    % differentiation provided in the Deep Learning Toolbox (slower).
    % Call manoptAD to automatically obtain the egrad:
    % problem = manoptAD(problem, 'egrad');

    for iter = 1 : n_iterations
        U = trustregions(problem, U, options);
        epsilon = epsilon * reduction;
    end
    warning('on', 'manopt:getHessian:approx');

    % Return the cost as the actual sum of distances, not smoothed.
    epsilon = 0;
    cost = robustpca_cost(U);

    % If working with the auto-generated input, plot the results.
    if nargin == 0
        figure;
        scatter(X(1, :), X(2, :));
        hold on;
        plot(U(1)*[-1, 1]*100, U(2)*[-1, 1]*100, 'r');
        hold off;
        % Compare to a standard PCA
        [Upca, ~, ~] = svds(X, 1);
        hold on;
        plot(Upca(1)*[-1, 1]*100, Upca(2)*[-1, 1]*100, 'k');
        hold off;
        xlim(1.1*[min(X(1, :)), max(X(1, :))]);
        ylim(1.1*[min(X(2, :)), max(X(2, :))]);
        legend('data points', 'Robust PCA fit', 'Standard PCA fit');
    end

    % Smoothed cost
    function value = robustpca_cost(U)
        vecs = U*(U'*X) - X;
        sqnrms = sum(vecs.^2, 1);
        vals = sqrt(sqnrms + epsilon^2) - epsilon;
        value = mean(vals);
    end

    % Euclidean gradient of the smoothed cost (it will be transformed into
    % the Riemannian gradient automatically by Manopt).
    function G = robustpca_gradient(U)
        % Note that the computation of vecs and sqnrms is redundant
        % with their computation in the cost function. To speed
        % up the code, it would be wise to use the caching capabilities
        % of Manopt (the store structure). See the online documentation.
        % It is not done here to keep the code a bit simpler.
        UtX = U'*X;
        vecs = U*UtX - X;
        sqnrms = sum(vecs.^2, 1);
        % This explicit loop is a bit slow; the vectorized code below it
        % is equivalent and faster.
        % G = zeros(p, d);
        % for i = 1 : n
        %     G = G + (1/sqrt(sqnrms(i) + epsilon^2)) * vecs(:, i) * UtX(:, i)';
        % end
        % G = G/n;
        G = mean(multiscale(1./sqrt(sqnrms + epsilon^2), ...
                            multiprod(reshape(vecs, [p, 1, n]), ...
                                      multitransp(reshape(UtX, [d, 1, n])))), 3);
    end

end
```
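To sanity-check the Euclidean gradient formula used above, one can port the cost/gradient pair to NumPy and compare a finite difference of the cost against the inner product with the gradient. This is our own check, not part of the Manopt file; `cost_and_egrad` is a hypothetical name.

```python
import numpy as np

def cost_and_egrad(U, X, eps):
    """NumPy port of robustpca_cost / robustpca_gradient.
    Returns the smoothed cost and its Euclidean gradient with respect to U."""
    UtX = U.T @ X
    vecs = U @ UtX - X
    sqnrms = np.sum(vecs**2, axis=0)
    cost = float(np.mean(np.sqrt(sqnrms + eps**2) - eps))
    w = 1.0 / np.sqrt(sqnrms + eps**2)   # per-point weights 1/sqrt(||r_i||^2 + e^2)
    G = (vecs * w) @ UtX.T / X.shape[1]  # mean_i of w_i * vecs_i * UtX_i'
    return cost, G

# Finite-difference check at an orthonormal U:
# f(U + t*V) - f(U) ≈ t * <G, V> for small t and any direction V.
rng = np.random.default_rng(0)
p, d, n, eps = 5, 2, 40, 0.5
X = rng.standard_normal((p, n))
U, _ = np.linalg.qr(rng.standard_normal((p, d)))  # orthonormal basis
V = rng.standard_normal((p, d))
f0, G = cost_and_egrad(U, X, eps)
f1, _ = cost_and_egrad(U + 1e-6 * V, X, eps)
fd = (f1 - f0) / 1e-6
print(fd, float(np.sum(G * V)))  # the two numbers should agree closely
```

Note that the formula for G omits a term proportional to U'*(U*U'*X - X), which vanishes whenever U is orthonormal, so the check is only valid at points on the manifold (as produced by `qr` here).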

Generated on Fri 30-Sep-2022 13:18:25 by m2html © 2005