Question

For example, I have 9 variables and 362 cases. I've made PCA calculation, and found out that first 3 PCA coordinates are enough for me.

Now, I have new point in my 9-dimensional structure, and I want to project it to principal component system coordinate. How to get its new coordinates?

%# here is data (362x9)
load SomeData

[W, Y] = pca(data, 'VariableWeights', 'variance', 'Centered', true);

%# orthonormal coefficient matrix
W = diag(std(data))\W;

% Getting mean and weights of data (for future data)
[data, mu, sigma] = zscore(data);
sigma(sigma==0) = 1;

%# New point in original 9dim system
%# For example, it is the first point of our input data
x = data(1,:);
x = bsxfun(@minus,x, mu);
x = bsxfun(@rdivide, x, sigma);

%# New coordinates as principal components
y0 = Y(1,:); %# point we should get in result
y = (W*x')'; %# our result

%# error
sum(abs(y0 - y)) %# 142 => they are not the same point

%# plot
figure()
plot(y0,'g'); hold on;
plot(y,'r');

enter image description here

How to get coordinates of a new point projected to new principal component basis?

Was it helpful?

Solution

Main fallacy was in operation that converts points to new basis:

y = (W*x')';

Wikipedia says:

The projected vectors are the columns of the matrix

Y = W*·Z, 

where Y is L×N, W is M×L, Z is M×N,

but pca() returns W of size L×M and Y of size NxL

so, correct equation in Matlab is:

y = x*W

Below is the corrected code:

[W, Y] = pca(data, 'VariableWeights', 'variance', 'Centered', true);
W = diag(std(data))\W;

%# Getting mean and weights of data (for future data)
[~, mu, we] = zscore(data);
we(we==0) = 1;

%# New point in original 9dim system
%# For example, it is the first point of our input data
x = data(1,:); 
x = bsxfun(@minus,x, mu);
x = bsxfun(@rdivide, x, we);

%# New coordinates as principal components
y = x*W;
y0 = Y(1,:);
sum(abs(y0 - y)) %# 4.1883e-14 ~= 0
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top