Question

I have been reading several papers and articles related to Principal Component Analysis (PCA) and in some of them, there is one step which is quite unclear to me (in particular (3) in [Schölkopf 1996]).

Let me reproduce their reasoning below.


Consider the centered data set $D = \{\textbf{x}_k\}_{k=1}^M$ with $\textbf{x}_k \in \textbf{R}^N$ and $ \sum_{k=1}^M \textbf{x}_k = 0$. PCA diagonalizes the (sample) covariance matrix

$$ C = \frac{1}{M} \sum_{j=1}^M \textbf{x}_j \textbf{x}_j^T. \tag{1} $$

To do this we find the solution to the eigenvector equation

$$ \lambda \textbf{v} = C \textbf{v} \tag{2} $$

for eigenvalues $\lambda \geq 0$ and eigenvectors $\textbf{v} \in \textbf{R}^N\backslash \{{0}\}$. As

$$ \lambda \textbf{v} = C \textbf{v} = \frac{1}{M} \sum_{j=1}^M (\textbf{x}_j^T \textbf{v}) \textbf{x}_j, \tag{3} $$

all solutions $\textbf{v}$ with $\lambda \neq 0$ must lie in the span of $\textbf{x}_1, \dots, \textbf{x}_M$, hence (2) is equivalent to

$$ \lambda(\textbf{x}_k^T \textbf{v}) = \textbf{x}_k^T C \textbf{v}, \qquad \text{for } k = 1, \dots, M \tag{4} $$


In (4), doesn't $\lambda(\textbf{x}^T \textbf{v}) = \textbf{x}^T C \textbf{v}$ hold for **any** vector $\textbf{x}$? Why is (4) stated only for $\textbf{x} \in D$? I do not understand how they end up with (4).

Thanks.


Solution

The statement says that (2) and (4) are equivalent, i.e. (2)$\Rightarrow$(4) and (4)$\Rightarrow$(2). The first implication is trivial, as you correctly pointed out: $$\lambda v=Cv$$ implies $$\lambda x^Tv=x^TCv$$ for all $x$, not just for the data points in $D$. The second implication is the subtle one, and it is what the step is really about. It tells you that if you want to check whether a vector $v$ in the span of $x_1,\dots,x_M$ is an eigenvector of $C$, you don't have to verify (2) directly: when (4) is satisfied, (2) is satisfied as well. The reason is that the residual $\lambda v - Cv$ also lies in the span of $x_1,\dots,x_M$ (both $\lambda v$ and $Cv$ do), and (4) says precisely that this residual is orthogonal to every $x_k$, hence to the whole span; a vector that lies in a subspace and is orthogonal to that same subspace must be zero, so $\lambda v = Cv$.
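Here is a small NumPy sketch of the easy direction, on a random centered data set of my own choosing (the data, seed, and dimensions are illustrative assumptions, not from the original papers): for a genuine eigenpair $(\lambda, v)$ of $C$, the scalar equation $\lambda\, x^Tv = x^TCv$ holds for an arbitrary $x$, exactly as the question observes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative centered data set: M = 4 points in R^3 (my own choice).
X = rng.standard_normal((4, 3))
X -= X.mean(axis=0)                 # enforce sum_k x_k = 0
C = X.T @ X / len(X)                # sample covariance, as in (1)

# A genuine eigenpair (lam, v) of C:
eigvals, eigvecs = np.linalg.eigh(C)
lam, v = eigvals[-1], eigvecs[:, -1]

# The easy direction: lam * x^T v = x^T C v holds for ANY x,
# not only for the data points -- exactly the question's observation.
x = rng.standard_normal(3)          # an arbitrary vector
print(np.isclose(lam * (x @ v), x @ (C @ v)))
```

The content of the proof is the converse: testing the $M$ equations (4) on the data points alone already certifies (2), provided $v$ lies in the span of the data.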

Imagine you have 2 points in 3D space, $$x_1=(-1,-1,0)$$ $$x_2=(1,1,0)$$ (note that $x_1+x_2=0$, so this small "data set" is centered). Both points lie in the $xy$ plane. The covariance matrix is $$C=\frac{1}{2}\left(x_1x_1^T+x_2x_2^T\right)=\begin{bmatrix} 1 & 1 & 0 \\[0.3em] 1 & 1 & 0 \\[0.3em] 0 & 0 & 0 \end{bmatrix}$$ Now suppose you want to know whether $v=[1\ 1\ 0]^T$ is an eigenvector with eigenvalue $\lambda=2$. The statement tells you that you can either check (2) directly, $$2v=Cv,$$ which is 3 scalar equations, or check (4), $$2\,x_1^Tv=x_1^TCv \qquad\text{and}\qquad 2\,x_2^Tv=x_2^TCv,$$ which are only 2 equations (here $x_1^Tv=-2$, $x_2^Tv=2$, and both checks pass).
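As a numerical sanity check of this equivalence, here is a NumPy sketch that builds the covariance from the two points above, computes an eigenpair itself (rather than taking one on faith), and confirms that the 3 equations of (2) and the 2 equations of (4) agree on it:

```python
import numpy as np

# The two points from the example; rows are x_k. Their sum is 0, so the
# data set is centered.
X = np.array([[-1.0, -1.0, 0.0],
              [ 1.0,  1.0, 0.0]])
M = len(X)

# Sample covariance C = (1/M) sum_j x_j x_j^T, as in (1).
C = X.T @ X / M

# Take the top eigenpair of C.
eigvals, eigvecs = np.linalg.eigh(C)
lam, v = eigvals[-1], eigvecs[:, -1]

# (2): lambda v = C v  -- N = 3 scalar equations.
print(np.allclose(lam * v, C @ v))

# (4): lambda x_k^T v = x_k^T C v for k = 1..M  -- only M = 2 equations.
print(np.allclose(lam * (X @ v), X @ (C @ v)))
```

Both checks print `True`; the saving of $N$ versus $M$ equations is modest here, but it is exactly the trade that pays off when $N \gg M$ (or, as in kernel PCA, when $N$ is effectively infinite).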

Licensed under: CC-BY-SA with attribution