Question

I am trying to understand principal component analysis in Matlab.

There seem to be at least three different functions that do it.

I have some questions regarding the code below:

  1. Am I correctly creating approximate x values using only one eigenvector (the one corresponding to the largest eigenvalue)? I think so?

  2. Why are PC and V, which are both meant to be the loadings for (x'*x), presented differently? The column order is reversed because eig does not put the eigenvector for the largest eigenvalue first, but why are they the negative of each other?

  3. Why does eig not order the eigenvalues, with the eigenvector corresponding to the largest eigenvalue in the first column?

  4. Using the code below I get back to the input matrix x when using svd and eig, but the results from princomp seem to be totally different. What do I have to do to make princomp match the other two functions?

Code:

x = [1 2; 3 4; 5 6; 7 8]

econFlag = 0;

% PCA via singular value decomposition
[U, sigma, V] = svd(x, econFlag);

U1 = U(:, 1);
V1 = V(:, 1);
sigma_partial = sigma(1, 1);

score1 = U * sigma;
test1 = score1 * V';                    % full reconstruction of x

score_partial = U1 * sigma_partial;
test1_partial = score_partial * V1';    % reconstruction from the first singular vector only


% PCA via eigendecomposition of x'*x
[PC, D] = eig(x' * x)

score2 = x * PC;
test2 = score2 * PC';                   % full reconstruction of x

PC1 = PC(:, 2);                         % second column: eig put the largest eigenvalue last
score2_partial = x * PC1;
test2_partial = score2_partial * PC1';  % reconstruction from one eigenvector only

% PCA via the Statistics Toolbox function
[o1, o2, o3] = princomp(x);

Solution

  1. Yes. According to the documentation of svd, the diagonal elements of the output S are in decreasing order. There is no such guarantee for the output D of eig, though.

  2. Eigenvectors and singular vectors have no defined sign. If a is an eigenvector, so is -a.

  3. I've often wondered the same. Laziness on the part of TMW? Optimization, because sorting would be an additional step and not everybody needs 'em sorted?

  4. princomp centers the input data before computing the principal components. This makes sense, as PCA is normally computed with respect to the covariance matrix, and the eigenvectors of x' * x are only identical to those of the covariance matrix if x is mean-free (see the sketch after this list).
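
A minimal sketch tying points 1, 2 and 4 together, using the same toy matrix x as in the question (variable names other than those in the question are mine; princomp is the old Statistics Toolbox function, replaced by pca in newer releases):

x = [1 2; 3 4; 5 6; 7 8];

% Points 1 and 2: the singular values of x are the square roots of the
% eigenvalues of x'*x, but only svd guarantees a sorted output.
[U, sigma, V] = svd(x, 0);
[PC, D] = eig(x' * x);
diag(sigma).^2                   % sorted, largest first
sort(diag(D), 'descend')         % the same values, once sorted by hand

% Point 2: the loading vectors agree only up to ordering and sign.
abs(V)
abs(PC)                          % same columns, possibly reordered and sign-flipped

% Point 4: princomp centers the data, so its coefficients match the
% eigenvectors of cov(x) rather than those of x'*x.
xc = bsxfun(@minus, x, mean(x)); % column-mean-centered copy of x
[coeff, score_pc] = princomp(x);
[Vc, Dc] = eig(cov(x));
abs(coeff)
abs(Vc)                          % same columns as coeff, up to sign and ordering
score_pc - xc * coeff            % essentially zero: princomp scores are xc * coeff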


I would compute the PCA by transforming to the basis of the eigenvectors of the covariance matrix (i.e. of the centered data), but apply this transform to the original (uncentered) data. This captures a maximum of variance with as few principal components as possible, while still recovering the original data from all of them:

[V, D] = eig(cov(x));

score = x * V;
test = score * V';

test is identical to x, up to numerical error.
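
For example, a quick numerical check on the code above:

max(abs(test(:) - x(:)))   % on the order of eps, i.e. roughly 1e-15 here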

In order to easily pick the components with the most variance, let's fix that lack of sorting ourselves:

[V, D] = eig(cov(x));
[D, ind] = sort(diag(D), 'descend');
V = V(:, ind);

score = x * V;
test = score * V';

Reconstruct the signal using the strongest principal component only:

test_partial = score(:, 1) * V(:, 1)';
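
To see how much of the variance that single component carries, the sorted eigenvalues from above can be turned into explained-variance fractions (a small sketch; D is the sorted eigenvalue vector from the previous snippet):

explained = D / sum(D)                     % fraction of the (centered) variance per component
residual  = norm(x - test_partial, 'fro')  % what dropping the weaker components costs here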

In response to Amro's comments: It is of course also possible to first remove the means from the input data, and transform these "centered" data. In that case, for perfect reconstruction of the original data it would be necessary to add the means again. The way to compute the PCA given above is the one described by Neil H. Timm, Applied Multivariate Analysis, Springer 2002, page 446:

Given an observation vector Y with mean mu and covariance matrix Sigma of full rank p, the goal of PCA is to create a new set of variables called principal components (PCs) or principal variates. The principal components are linear combinations of the variables of the vector Y that are uncorrelated such that the variance of the jth component is maximal.

Timm later defines "standardized components" as those which have been computed from centered data and are then divided by the square roots of the eigenvalues (i.e. the variances); such "standardized principal components" have mean 0 and variance 1.
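
A sketch of that centered variant, including the means that have to be added back for reconstruction and Timm's standardized components (variable names are mine; note that the toy matrix from the question has a rank-deficient covariance, so the standardization only makes sense for full-rank data, as Timm assumes):

mu = mean(x);                               % column means
xc = bsxfun(@minus, x, mu);                 % centered data

[V, D] = eig(cov(x));
[D, ind] = sort(diag(D), 'descend');
V = V(:, ind);

score_c = xc * V;                           % principal components of the centered data
x_rec   = bsxfun(@plus, score_c * V', mu);  % add the means back to recover x

% standardized components in Timm's sense: mean 0, variance 1
score_std = bsxfun(@rdivide, score_c, sqrt(D)');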

Licensed under: CC-BY-SA with attribution