Question

I'm getting a memory error when trying to run KernelPCA on a data set of 30,000 texts. RandomizedPCA works fine. I think what's happening is that RandomizedPCA works with sparse arrays and KernelPCA doesn't.

Does anyone have a list of learning methods that are currently implemented with sparse array support in scikits-learn?

Solution

We don't have that yet. You have to read the docstrings of the individual classes for now.

In any case, nonlinear models tend not to work better than linear models on high-dimensional sparse data such as text documents, and they can overfit more easily.
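As a rough illustration of the linear route, here is a minimal sketch that reduces a sparse TF-IDF matrix without densifying it. It assumes a recent scikit-learn release, where `TruncatedSVD` and `TfidfVectorizer` are available (these may not match the API at the time of the question), and the four documents are made up for the example. The last lines only do the arithmetic for a dense 30,000 x 30,000 kernel matrix of 64-bit floats, which is one plausible reason a kernel method runs out of memory at that size.

```python
from scipy import sparse
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer

# Toy corpus, invented for this sketch; a real data set would have ~30,000 texts.
docs = [
    "sparse matrices save memory",
    "kernel methods build a dense gram matrix",
    "linear models scale to many documents",
    "text documents give high dimensional sparse features",
]

# TF-IDF features are returned as a SciPy sparse matrix.
X = TfidfVectorizer().fit_transform(docs)
is_sparse = sparse.issparse(X)

# TruncatedSVD is a linear reduction that accepts sparse input directly.
svd = TruncatedSVD(n_components=2, random_state=0)
X_reduced = svd.fit_transform(X)  # dense array of shape (n_docs, 2)

# Back-of-the-envelope size of a dense n x n kernel matrix of float64 values.
n_samples = 30_000
kernel_matrix_gib = n_samples ** 2 * 8 / 2 ** 30

print(is_sparse, X_reduced.shape, round(kernel_matrix_gib, 1))
```

The kernel matrix alone comes to roughly 6.7 GiB under these assumptions, regardless of how sparse the input features are.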

Licensed under: CC-BY-SA with attribution