PCA for complex-valued data
Question
I was surprised to hit this error when using PCA
from sklearn:
ValueError: Complex data not supported
It happens when trying to fit complex-valued data. Is this simply unimplemented? Should I just go ahead and do it 'manually' with SVD, or is there a catch with complex values?
Solution
Apparently this functionality is left out intentionally, see here. I'm afraid you have to use SVD, but that should be fairly straightforward:
import numpy as np

def pca(X):
    mean = X.mean(axis=0)
    centered = X - mean
    # np.linalg.svd handles complex input; rows of pcs are the principal components
    _, stds, pcs = np.linalg.svd(centered / np.sqrt(X.shape[0]))
    return stds**2, pcs  # explained variances, components as rows
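As a quick sanity check, the variances returned by a helper like this should equal the eigenvalues of the (Hermitian) covariance matrix of the data. A minimal sketch, assuming the same SVD-of-centered-data approach as above:

```python
import numpy as np

def pca(X):
    # SVD of the centered data, scaled so singular values squared are variances
    mean = X.mean(axis=0)
    centered = X - mean
    _, stds, pcs = np.linalg.svd(centered / np.sqrt(X.shape[0]))
    return stds**2, pcs

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3)) + 1j * rng.normal(size=(200, 3))

variances, components = pca(X)

# Complex covariance matrix: C = X_c^H X_c / N (Hermitian, so eigvalsh applies)
Xc = X - X.mean(axis=0)
cov = Xc.conj().T @ Xc / X.shape[0]
eigvals = np.sort(np.linalg.eigvalsh(cov))[::-1]  # descending, like the SVD output

print(np.allclose(variances, eigvals))  # True
```

This works because for centered data `Xc / sqrt(N) = U S V^H` implies `Xc^H Xc / N = V S^2 V^H`, so the squared singular values are exactly the covariance eigenvalues.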
OTHER TIPS
My implementation mimics the interface of sklearn's PCA, so existing code that deals with a PCA object should work seamlessly.
class ComplexPCA:
    def __init__(self, n_components):
        self.n_components = n_components
        self.s = self.components_ = None
        self.mean_ = None

    @property
    def explained_variance_ratio_(self):
        # Fraction of total variance per kept component, as in sklearn.
        return self.s[:self.n_components]**2 / np.sum(self.s**2)

    def fit(self, matrix, use_gpu=False):
        self.mean_ = matrix.mean(axis=0)
        centered = matrix - self.mean_  # center the data before the SVD
        if use_gpu:
            import tensorflow as tf  # torch doesn't handle complex values.
            tensor = tf.convert_to_tensor(centered)
            # Note tf.linalg.svd returns (s, u, v), not (u, s, vh) like NumPy.
            s, _, v = tf.linalg.svd(tensor, full_matrices=False)  # full=False ==> num_pc = min(N, M)
            self.s = s.numpy()
            vh = tf.transpose(v, conjugate=True).numpy()
        else:
            _, self.s, vh = np.linalg.svd(centered, full_matrices=False)  # full=False ==> num_pc = min(N, M)
        # It would be faster if the SVD were truncated to only n_components instead of min(M, N).
        # Keep the components as rows of the matrix so it is compatible with sklearn's PCA.
        self.components_ = vh[:self.n_components]

    def transform(self, matrix):
        data = matrix - self.mean_
        # Conjugate transpose, not plain transpose, for complex data;
        # for real inputs this reduces to the usual data @ components_.T.
        return data @ self.components_.conj().T

    def inverse_transform(self, matrix):
        return matrix @ self.components_ + self.mean_
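To sanity-check the approach, here is a self-contained sketch of the NumPy path (centering in fit, conjugate transpose in transform), condensed for the demo, with a roundtrip test: keeping all components, transform followed by inverse_transform should reconstruct the data exactly up to floating-point error.

```python
import numpy as np

class ComplexPCA:
    # Condensed NumPy-only version of the class, for a self-contained demo.
    def __init__(self, n_components):
        self.n_components = n_components

    def fit(self, X):
        self.mean_ = X.mean(axis=0)
        _, self.s, vh = np.linalg.svd(X - self.mean_, full_matrices=False)
        self.components_ = vh[:self.n_components]

    def transform(self, X):
        return (X - self.mean_) @ self.components_.conj().T

    def inverse_transform(self, T):
        return T @ self.components_ + self.mean_

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 4)) + 1j * rng.normal(size=(50, 4))

pca = ComplexPCA(n_components=4)  # keep all components for an exact roundtrip
pca.fit(X)
T = pca.transform(X)
X_rec = pca.inverse_transform(T)

print(np.allclose(X, X_rec))  # True
```

The roundtrip works because with all components kept, `components_` is a unitary matrix `V^H`, so projecting onto `V` and multiplying back by `V^H` is the identity on the centered data.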
Licensed under: CC-BY-SA with attribution
Not affiliated with datascience.stackexchange