Question

I'm getting a memory error when running KernelPCA on a data set of 30,000 texts. RandomizedPCA works fine. I think what's happening is that RandomizedPCA works with sparse arrays and KernelPCA doesn't.

Does anyone have a list of learning methods that are currently implemented with sparse array support in scikits-learn?

Solution

We don't have that yet. You have to read the docstrings of the individual classes for now.

Anyway, non-linear models do not tend to work better than linear models on high-dimensional sparse data such as text documents (and they can overfit more easily).
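
To put a rough number on the memory error: whatever the state of sparse support, kernel PCA works on an n_samples x n_samples kernel matrix, and that matrix is typically dense even when the input is sparse. The sketch below is only back-of-the-envelope arithmetic, assuming 8-byte float64 entries; the helper name `dense_kernel_matrix_bytes` is made up for illustration and is not part of any library.

```python
def dense_kernel_matrix_bytes(n_samples, bytes_per_value=8):
    """Bytes needed to hold a dense n_samples x n_samples kernel matrix.

    bytes_per_value=8 assumes float64 entries (an assumption, not a
    statement about what any particular library allocates).
    """
    return n_samples * n_samples * bytes_per_value


n_texts = 30_000  # data set size from the question
size_gib = dense_kernel_matrix_bytes(n_texts) / 2**30
print(f"{n_texts} samples -> about {size_gib:.1f} GiB for the kernel matrix alone")
```

With 30,000 texts that is already several GiB before any eigendecomposition starts, and the cost grows quadratically with the number of documents. A linear method that works directly on the sparse matrix avoids this.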

License: CC-BY-SA with attribution
Not affiliated with StackOverflow