Question

In Principal Component Analysis (PCA):

I was wondering why the data projected onto a principal component has variance given by the eigenvalue corresponding to that principal eigenvector.

I can't find the explanation in my textbook.


Solution

In Principal Components Analysis (PCA), you are calculating a rotation of the original coordinate system such that all non-diagonal elements of the new covariance matrix become zero (i.e., the new coordinates are uncorrelated). The eigenvectors define the directions of the new coordinate axes and the eigenvalues correspond to the diagonal elements of the new covariance matrix (the variance along the new axes). So the eigenvalues, by definition, define the variance along the corresponding eigenvectors.
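To see the claim directly (a short derivation in standard notation, not part of the original answer): let \Sigma be the covariance matrix of the centered data X and let w be a unit-length eigenvector with \Sigma w = \lambda w. The projection of the data onto w is Xw, and its variance is

    \operatorname{Var}(Xw) = w^\top \Sigma\, w = w^\top (\lambda w) = \lambda\, w^\top w = \lambda.

So the eigenvalue is exactly the variance of the data along that eigenvector's direction.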

Note that if you were to multiply all your original data values by some constant greater than one, that would have the effect of increasing the variance (and covariance) of the data. If you then perform PCA on the modified data, the eigenvectors you compute would be the same (you still need the same rotation to uncorrelate your coordinates), but the eigenvalues would increase, because the variance of the data along the new coordinate axes has increased.
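In symbols (again a quick check, not from the original answer): scaling every data value by a constant c scales the covariance matrix to c^2 \Sigma, and from \Sigma w = \lambda w it follows that

    (c^2 \Sigma)\, w = (c^2 \lambda)\, w,

so the eigenvectors are unchanged while each eigenvalue is multiplied by c^2.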

OTHER TIPS

Good question. Please read CMU's 36-350 lecture notes. In short, the way the PCA optimization problem is framed leads to a constrained (Lagrange-multiplier) optimization problem (pp. 2-5) whose solution is an eigenproblem: you take the eigenvectors of the sample covariance matrix.
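The setup referred to there looks roughly like this (sketched in standard notation, not quoted from the notes): maximize the variance of the projection subject to the projection direction having unit length,

    \max_{w} \; w^\top S\, w \quad \text{subject to} \quad w^\top w = 1,

with Lagrangian L(w, \lambda) = w^\top S w - \lambda (w^\top w - 1). Setting the gradient with respect to w to zero gives 2 S w - 2 \lambda w = 0, i.e. S w = \lambda w, so the maximizer is an eigenvector of the sample covariance matrix S, and the variance it achieves, w^\top S w, equals \lambda.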

What you're doing in principal component analysis is "diagonalizing the covariance matrix": in the coordinate basis that diagonalizes the covariance, you can just read off the variance of each component.

To really understand it requires learning the linear algebra that underlies the eigenvalue problem (facts like "the eigenvalues of a Hermitian matrix are invariant under orthogonal transformations"), but something you could try is the following:

  1. Generate some x-values as zero-mean Gaussians with variance sigma_x^2.
  2. Generate independent y-values as zero-mean Gaussians with variance sigma_y^2 < sigma_x^2.
  3. Visualize this as a 2-dimensional data set. Note that it's been constructed so that the covariance matrix is diagonal, and the variance of the data in each direction (x, y) is the corresponding diagonal element. Also note that the two eigenvalues of this matrix are sigma_x^2 and sigma_y^2, and the eigenvectors are [1,0] and [0,1].
  4. Now construct a correlated data set by simply rotating the whole picture. Mathematically, pick an orthogonal matrix O and generate a rotated version of each [x,y] sample. You'll find that the covariance matrix of this transformed data set has off-diagonal elements, i.e. a correlation between x and y. But if you do the eigenvalue decomposition, the eigenvectors are just the columns of the orthogonal matrix used to rotate the data in the first place, and the eigenvalues are the original ones, sigma_x^2 and sigma_y^2 (a short numpy sketch of this experiment follows the list).
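Here is a minimal numpy sketch of that experiment (the visualization in step 3 is omitted; the variable names, seed, and 30-degree rotation are illustrative choices, not from the answer):

```python
import numpy as np

rng = np.random.default_rng(0)

# Steps 1-2: independent zero-mean Gaussians with different variances.
sigma_x2, sigma_y2 = 4.0, 1.0                      # sigma_x^2 > sigma_y^2
n = 100_000
data = np.column_stack([
    rng.normal(0.0, np.sqrt(sigma_x2), n),
    rng.normal(0.0, np.sqrt(sigma_y2), n),
])                                                  # covariance ~ diag(4, 1)

# Step 4: rotate every [x, y] sample by an orthogonal matrix O
# (here a 30-degree rotation, an arbitrary illustrative choice).
theta = np.deg2rad(30.0)
O = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
rotated = data @ O.T                                # each row becomes O @ [x, y]

# The rotated data are correlated: off-diagonal covariance terms appear.
cov = np.cov(rotated, rowvar=False)
print("covariance of rotated data:\n", cov)

# The eigendecomposition undoes the rotation: eigenvalues ~ (1, 4) and
# eigenvectors ~ the columns of O, up to sign and ordering.
eigvals, eigvecs = np.linalg.eigh(cov)
print("eigenvalues:", eigvals)
print("eigenvectors:\n", eigvecs)
```

Running it, the eigenvalues of the rotated data's covariance matrix match the variances you generated with, and the eigenvectors match the rotation you applied, which is the point the answer is making.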

Principal components analysis, i.e. the eigenvalue decomposition of the covariance matrix, runs this process in reverse: it starts with the correlated data set and derives the coordinate basis that diagonalizes the covariance matrix.

Getting your head around it will probably take both learning the formal mathematics and some hands-on experience; trying it out (and visualizing it) on 2- or 3-dimensional problems will help you get a feel for it.

Licensed under: CC-BY-SA with attribution