python, scikits-learn: which learning methods support sparse feature vectors?

https://stackoverflow.com/questions/10304280

python
machine-learning
scikits
scikit-learn

03-06-2021

Question

I'm getting a memory error when trying to run KernelPCA on a data set of 30,000 texts. RandomizedPCA works all right. I think what's happening is that RandomizedPCA works with sparse arrays and KernelPCA doesn't.

Does anyone have a list of learning methods that are currently implemented with sparse array support in scikits-learn?

Solution

We don't have such a list yet. For now, you have to read the docstrings of the individual classes.
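Reading the docstrings can be partly automated. The following is a minimal sketch, not an official scikit-learn facility: the helper name `sparse_mentions` and the two toy classes are hypothetical, and the approach only assumes that a class documents sparse support by using the word "sparse" in its docstrings.

```python
import inspect


def sparse_mentions(cls, method_names=("fit", "transform", "predict")):
    """Return (label, line) pairs for docstring lines that mention 'sparse'."""
    hits = []
    # Look at the class docstring and at the docstrings of selected methods.
    targets = [(cls.__name__, cls)]
    for name in method_names:
        method = getattr(cls, name, None)
        if method is not None:
            targets.append((f"{cls.__name__}.{name}", method))
    for label, obj in targets:
        doc = inspect.getdoc(obj) or ""
        for line in doc.splitlines():
            if "sparse" in line.lower():
                hits.append((label, line.strip()))
    return hits


# Hypothetical stand-ins for estimator classes, so the demo needs no third-party library.
class ToySparseModel:
    """Toy estimator."""

    def fit(self, X):
        """Fit the model.

        X : {array-like, sparse matrix} of shape (n_samples, n_features)
        """
        return self


class ToyDenseModel:
    """Toy estimator."""

    def fit(self, X):
        """Fit the model.

        X : array-like of shape (n_samples, n_features)
        """
        return self


for model in (ToySparseModel, ToyDenseModel):
    print(model.__name__, sparse_mentions(model))
```

The same function can be passed a real estimator class instead of the toy ones. A hit is only a hint: a docstring may mention sparse input precisely to say it is not supported, so the matching lines still need to be read.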

In any case, nonlinear models tend not to work better than linear models on high-dimensional sparse data such as text documents (and they can overfit more easily).
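Memory is a separate consideration from accuracy. As a rough back-of-the-envelope sketch, assuming a kernel method materialises a full dense kernel matrix with one 8-byte float per pair of samples (an assumption about the implementation, not something stated above), the size for the 30,000 texts in the question can be estimated as follows. The helper name `dense_bytes` is hypothetical.

```python
def dense_bytes(n_rows, n_cols, itemsize=8):
    """Bytes needed for a dense n_rows x n_cols array of itemsize-byte values."""
    return n_rows * n_cols * itemsize


n_samples = 30_000  # number of texts mentioned in the question

# Assumption: the kernel matrix is n_samples x n_samples, dense, float64.
kernel_bytes = dense_bytes(n_samples, n_samples)
print(f"{kernel_bytes / 1e9:.1f} GB")
```

Under these assumptions the kernel matrix grows with the square of the number of samples, regardless of how sparse the input features are.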

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow