Home > examples > shapefit_smoothed.m

# shapefit_smoothed

## PURPOSE

ShapeFit formulation for sensor network localization from pair directions

## SYNOPSIS

function [T_hub, T_lsq, T_cvx] = shapefit_smoothed(V, J)

## DESCRIPTION

 ShapeFit formulation for sensor network localization from pair directions

function [T_hub, T_lsq, T_cvx] = shapefit_smoothed(V, J)

This example in based on the paper http://arxiv.org/abs/1506.01437:
ShapeFit: Exact location recovery from corrupted pairwise directions, 2015
by Paul Hand, Choongbum Lee and Vladislav Voroninski.

The problem is the following: there are n points t_1, ..., t_n in R^d
which need to be estimated (localized). To this end, we are given
measurements of some of the pairwise directions,
v_ij = (t_i - t_j) / norm(t_i - t_j) + noise.
Assume there are m such pairwise measurements, defining a graph with m
edges over n nodes. J is the signed incidence matrix of this graph (see
in code). To build J from lists I, J in R^m of nodes, use:
J = sparse([I ; J], [(1:m)' ; (1:m)'], [ones(m, 1), -ones(m, 1)], n, m, 2*m);

The measurements are arranged in the matrix V of size d x m. From V, we
attempt to estimate t_1, ..., t_n, arranged in T, a matrix of size d x n.
The estimation can only be done up to translation and scaling. The
returned T's are centered: the columns sum to zero.

ShapeFit is a formulation of this estimation problem which is robust to
outliers. It is a nonsmooth, convex optimization problem over an affine
space, i.e., a linear manifold. We smooth the cost using the pseudo-Huber
loss cost and solve the problem using Manopt. This requires two
ingredients: (1) a factory to describe the affine space, see
shapefitfactory; (2) defining the cost and its derivative, and minimizing
it while progressively tightening the smooth approximation (with
warm-start).

Simply run the example to see the results on random data. It compares the
smoothed ShapeFit formulation against a least-squares formulation, when
the measurements include outliers. See in code to vary the noise
parameters, dimension d, number of nodes n, number of measurements m, ...

Note: since the problem is convex, this returns the global optimum.
This example also illustrates the use of Manopt for optimization under
linear constraints: admittedly a simple subcase of optimization on
manifolds.

See also: shapefitfactory

## CROSS-REFERENCE INFORMATION

This function calls:
• shapefitfactory Linear manifold structure for optimization over the ShapeFit search space
• full FULL Convert TTeMPS tensor to full array
• minus MINUS Substraction of two TT/MPS tensors.
• norm NORM Norm of a TT/MPS tensor.
• full FULL Convert TTeMPS tensor to full array
• minus MINUS Substraction of two TT/MPS block-mu tensors.
• norm NORM Norm of a TT/MPS block-mu tensor.
• full FULL Convert TTeMPS_op operator to full array
• trustregions Riemannian trust-regions solver for optimization on manifolds.
This function is called by:

## SOURCE CODE

0001 function [T_hub, T_lsq, T_cvx] = shapefit_smoothed(V, J)
0002 % ShapeFit formulation for sensor network localization from pair directions
0003 %
0004 % function [T_hub, T_lsq, T_cvx] = shapefit_smoothed(V, J)
0005 %
0006 % This example in based on the paper http://arxiv.org/abs/1506.01437:
0007 % ShapeFit: Exact location recovery from corrupted pairwise directions, 2015
0008 % by Paul Hand, Choongbum Lee and Vladislav Voroninski.
0009 %
0010 % The problem is the following: there are n points t_1, ..., t_n in R^d
0011 % which need to be estimated (localized). To this end, we are given
0012 % measurements of some of the pairwise directions,
0013 % v_ij = (t_i - t_j) / norm(t_i - t_j) + noise.
0014 % Assume there are m such pairwise measurements, defining a graph with m
0015 % edges over n nodes. J is the signed incidence matrix of this graph (see
0016 % in code). To build J from lists I, J in R^m of nodes, use:
0017 % J = sparse([I ; J], [(1:m)' ; (1:m)'], [ones(m, 1), -ones(m, 1)], n, m, 2*m);
0018 %
0019 % The measurements are arranged in the matrix V of size d x m. From V, we
0020 % attempt to estimate t_1, ..., t_n, arranged in T, a matrix of size d x n.
0021 % The estimation can only be done up to translation and scaling. The
0022 % returned T's are centered: the columns sum to zero.
0023 %
0024 % ShapeFit is a formulation of this estimation problem which is robust to
0025 % outliers. It is a nonsmooth, convex optimization problem over an affine
0026 % space, i.e., a linear manifold. We smooth the cost using the pseudo-Huber
0027 % loss cost and solve the problem using Manopt. This requires two
0028 % ingredients: (1) a factory to describe the affine space, see
0029 % shapefitfactory; (2) defining the cost and its derivative, and minimizing
0030 % it while progressively tightening the smooth approximation (with
0031 % warm-start).
0032 %
0033 % Simply run the example to see the results on random data. It compares the
0034 % smoothed ShapeFit formulation against a least-squares formulation, when
0035 % the measurements include outliers. See in code to vary the noise
0036 % parameters, dimension d, number of nodes n, number of measurements m, ...
0037 %
0038 % Note: since the problem is convex, this returns the global optimum.
0039 % This example also illustrates the use of Manopt for optimization under
0040 % linear constraints: admittedly a simple subcase of optimization on
0041 % manifolds.
0042 %
0043 %
0045
0046 % This file is part of Manopt: www.manopt.org.
0047 % Original author: Nicolas Boumal, June 18, 2015.
0048 % Contributors:
0049 % Change log:
0050 %
0051 %   Jan.  4, 2021 (NB):
0052 %       Changes for compatibility with Octave 6.1.0.
0053 %
0054 %   Aug. 23, 2021 (XJ):
0056
0057     % Generic useful functions
0058     center_cols = @(A) bsxfun(@minus, A, mean(A, 2));
0059     normalize_cols = @(A) bsxfun(@times, A, 1./sqrt(sum(A.^2, 1)));
0060     sqnorm_cols = @(A) sum(A.^2, 1);
0061
0062
0063     % DATA GENERATION
0064     %
0065     % If no inputs are specified, generate some random data for
0066     % illustration purposes.
0067     if nargin == 0
0068
0069         % We estimate n points in R^d
0070         d =   2;
0071         n = 500;
0072
0073         % Those points are the columns of T : they are what we need to
0074         % estimate, up to scaling and translation. We center T for
0075         % convenience.
0076         T_tru = center_cols(rand(d, n));
0077
0078         % We get a measurement of some pairs of relative directions.
0079         % Which pairs is encoded in this graph, with J being the (signed,
0080         % transposed) incidence matrix. J is n x m, sparse.
0081         % There are roughly edge_fraction * n * (n-1) / 2 measurements.
0082         edge_fraction = 0.10;
0083         % [ii, jj] = erdosrenyi(n, edge_fraction);
0084         [ii, jj] = randomgraph(n, edge_fraction*nchoosek(n, 2));
0085         m = length(ii);
0086         J = sparse([ii ; jj], [(1:m)' ; (1:m)'], ...
0087                    [ones(m, 1), -ones(m, 1)], n, m, 2*m);
0088
0089         % The measurements give the directions from one point to another.
0090         % That is: we get the position difference, normalized. Here, with
0091         % Gaussian noise. Least-squares will be well-suited for this.
0092         sigma = .0;
0093         V = normalize_cols(T_tru*J + sigma*randn(d, m)); % d x m
0094
0095         % Outliers: we replace some of the direction measurements by
0096         % uniformly random unit-norm vectors.
0097         outlier_fraction = .3;
0098         outliers = rand(1, m) < outlier_fraction;
0099         V(:, outliers) = normalize_cols(randn(d, sum(outliers)));
0100
0101     end % done generating random data
0102
0103
0104
0105
0106
0107     [d, m] = size(V);
0108     n = size(J, 1);
0109     assert(size(J, 2) == m, 'J must be n x m, with V of size d x m.');
0110
0111     VJt = full(V*J');
0112
0113     % This "manifold" describes the Euclidean space of matrices T of size
0114     % d x n such that <VJt, T> = 1 and T has centered columns: T1 = 0.
0115     problem.M = shapefitfactory(VJt);
0116
0117     % This linear operator computes the orthogonal projection of each
0118     % difference ti - tj on the orthogonal space to v_ij.
0119     % If the alignment is compatible with the data, then this is zero.
0120     % A(T) is a d x m matrix.
0121     A = @(T) Afun(T, V, J);
0122     function AT = Afun(T, V, J)
0123         TJ = T*J;
0124         AT = TJ - bsxfun(@times, V, sum(V .* TJ, 1));
0125     end
0126
0127     % Need the adjoint of A, too. Input is d x m, output is d x n.
0128     Astar = @(W) (W - bsxfun(@times, V, sum(V.*W, 1)))*J';
0129
0130
0131
0132     % LEAST-SQUARES
0133     %
0134     % First, work with a least-squares formulation of the problem.
0135     % That is, we minimize a (very nice) convex cost over an affine space.
0136     % Since the smooth solvers in Manopt converge to critical points, this
0137     % means they converge to global optimizers.
0138     problem.cost  = @(T) 0.5*norm(A(T), 'fro')^2;
0140     problem.ehess = @(T, Tdot) Astar(A(Tdot));
0141
0142     % An alternative way to compute the egrad and the ehess is to use
0143     % automatic differentiation provided in the deep learning toolbox (slower)
0144     % Notice that the function norm is not supported for AD so far.
0145     % Replace norm(...,'fro')^2 with cnormsqfro described in the file
0146     % manoptADhelp.m. Also operations between sparse matrices and dlarrys
0147     % is not supported so far. Transform V,J into full matrices. AD does
0148     % not support bsxfunc as well. Translate it into the expression of
0149     % repmat and .*. The whole thing would make the computation much slower
0150     % than specifying the egrad and the ehess.
0151     % V_full = full(V);
0152     % J_full = full(J);
0153     % problem.cost  = @(T) 0.5*cnormsqfro(A_AD(T));
0154     % function AT = A_AD(T)
0155     %    sum1 = sum(V_full .* (T*J_full), 1);
0156     %    repsum1 = repmat(sum1,size(sum1,1),1);
0157     %    AT = T*J_full - V_full.*repsum1;
0158     % end
0161
0162     T_lsq = trustregions(problem);
0163
0164
0165
0166     % PSEUDO-HUBER SMOOTHED SHAPEFIT
0167     %
0168     % Now solve the same, but with a pseudo-Huber loss instead of
0169     % least-squares.
0170     % We iteratively sharpen the Huber function, i.e., reduce delta.
0171     % It is important to warm start in such a fashion: trying to optimize
0172     % with a random initial guess and a very small delta is typically slow.
0173     % How fast one should decrease delta, and how accurately one should
0174     % optimize at each intermediate stage, is open for research.
0175     delta = 1;
0176     T_hub = []; % We could use T_lsq as initial guess, too.
0177     problem = rmfield(problem, 'ehess');
0178     warning('off', 'manopt:getHessian:approx');
0179     for iter = 1 : 12
0180
0181         delta = delta / 2;
0182
0183         h = @(x2) sqrt(x2 + delta^2) - delta; % pseudo-Huber loss
0184
0185         problem.cost  = @(T) sum(h(sqnorm_cols(A(T))));
0186         problem.egrad = @(T) Astar(bsxfun(@times, A(T), ...
0187                                     1./sqrt(sqnorm_cols(A(T)) + delta^2)));
0188
0190         % problem = rmfield(problem, 'egrad');
0191         % problem = rmfield(problem, 'autogradfunc');
0193
0194         % Solve, using the previous solution as initial guess.
0195         T_hub = trustregions(problem, T_hub);
0196
0197     end
0198
0199
0200
0201     % CVX SHAPEFIT
0202     %
0203     % Actual ShapeFit cost (nonsmooth), with CVX.
0204     % You can get CVX from http://cvxr.com/.
0205     use_cvx_if_available = false;
0206     if use_cvx_if_available && exist('cvx_version', 'file')
0207         T_cvx = shapefit_cvx(V, J);
0208     else
0209         T_cvx = NaN(d, n);
0210     end
0211
0212
0213
0214     % VISUALIZATION
0215     %
0216     % If T_true is available, for display, we scale the estimators to match
0217     % the norm of the target. The scaling factor is obtained by minimizing
0218     % the norm of the discrepancy : norm(T_tru - scale*T_xxx, 'fro').
0219     % A plot is produced if d is 2 or 3.
0220     if exist('T_tru', 'var') && (d == 2 || d == 3)
0221
0222         T_lsq = T_lsq * trace(T_tru'*T_lsq) / norm(T_lsq, 'fro')^2;
0223         T_hub = T_hub * trace(T_tru'*T_hub) / norm(T_hub, 'fro')^2;
0224         T_cvx = T_cvx * trace(T_tru'*T_cvx) / norm(T_cvx, 'fro')^2;
0225
0226
0227         switch d
0228             case 2
0229                 plot(T_tru(1, :), T_tru(2, :), 'bo', ...
0230                      T_lsq(1, :), T_lsq(2, :), 'rx', ...
0231                      T_hub(1, :), T_hub(2, :), 'k.', ...
0232                      T_cvx(1, :), T_cvx(2, :), 'g.');
0233             case 3
0234                 plot3(T_tru(1, :), T_tru(2, :), T_tru(3, :), 'bo', ...
0235                       T_lsq(1, :), T_lsq(2, :), T_lsq(3, :), 'rx', ...
0236                       T_hub(1, :), T_hub(2, :), T_hub(3, :), 'k.', ...
0237                       T_cvx(1, :), T_cvx(2, :), T_cvx(3, :), 'g.');
0238         end
0239
0240         legend('ground truth', 'least squares', ...
0241                sprintf('pseudo-huber, \\delta = %.1e', delta), ...
0242                'CVX ShapeFit');
0243
0244         title(sprintf(['ShapeFit problem : d = %d, n = %d, edge ' ...
0245                        'fraction = %.2g, sigma = %.2g, outlier ' ...
0246                        'fraction = %.2g'], d, n, edge_fraction, sigma, ...
0247                        outlier_fraction));
0248         axis equal;
0249
0250     end
0251
0252 end
0253
0254
0255 % If CVX is available, it can be used to solve the nonsmooth problem
0256 % directly, very elegantly.
0257 function T_cvx = shapefit_cvx(V, J)
0258     d = size(V, 1);
0259     n = size(J, 1); %#ok<NASGU>
0260     VJt = full(V*J');
0261     cvx_begin
0262         variable T_cvx(d, n)
0263         % We want to minimize this:
0264         % minimize sum( norms( A(T_cvx), 2, 1 ) )
0265         % But unfortunately, CVX doesn't handle bsxfun. Instead, we use
0266         % repmat, which is slower, and hence hurts the comparison in
0267         % disfavor of CVX.
0268         minimize sum( norms( T_cvx*J - V .* repmat(sum(V .* (T_cvx*J), 1), [d, 1])  , 2, 1 ) )
0269         sum(T_cvx, 2) == zeros(d, 1); %#ok<NODEF,EQEFF>
0270         VJt(:).' * T_cvx(:) == 1; %#ok<EQEFF>
0271     cvx_end
0272 end
0273
0274
0275 function [I, J, A] = erdosrenyi(n, p) %#ok<DEFNU>
0276 % Generate a random Erdos-Renyi graph with n nodes and edge probability p.
0277 %
0278 % [I, J, A] = erdosrenyi(n, p)
0279 %
0280 % Returns a list of edges (I(k), J(k)) for a random, undirected Erdos-Renyi
0281 % graph with n nodes and edge probability p. A is the adjacency matrix.
0282 %
0283 % I(k) < J(k) for all k, i.e., all(I<J) is true.
0284 %
0285 % The memory requirements for this simple implementation scale as O(n^2).
0286
0287     X = rand(n);
0288     mask = X <= p;
0291     X = triu(X, 1);
0292
0293     % A is the adjacency matrix
0294     A = X + X';
0295
0296     [I, J] = find(X);
0297
0298 end
0299
0300
0301 function [I, J, A] = randomgraph(n, m)
0302 % Generates a random graph over n nodes with at most m edges.
0303 %
0304 % function [I, J, A] = randomgraph(n, m)
0305 %
0306 % Selects m (undirected) edges from a graph over n nodes, uniformly at
0307 % random, with replacement. The self edges and repeated edges are then
0308 % discarded. The remaining number of edges is at most m, and should be
0309 % close to m if m is much smaller than nchoosek(n, 2).
0310 %
0311 % The output satisfies all(I < J). A is the corresponding adjacency matrix.
0312 %
0313 % Uses O(m) memory (not O(n^2)), making it fit for large, sparse graphs.
0314
0315     % Generate m edges at random, with replacement, and remove repetitions.
0316     IJ = unique(sort(randi(n, m, 2), 2), 'rows');
0317
0318     % Remove self-edges if any.
0319     IJ = IJ(IJ(:, 1) ~= IJ(:, 2), :);
0320
0321     % Actual number of selected edges
0322     k = size(IJ, 1);
0323
0324     I = IJ(:, 1);
0325     J = IJ(:, 2);
0326
0327     % Build the adjacency matrix of the graph.
0328     A = sparse([I ; J], [J ; I], ones(2*k, 1), n, n, 2*k);
0329
0330 end

Generated on Fri 30-Sep-2022 13:18:25 by m2html © 2005