Question

1) How can I apply feature reduction methods such as LSI in Weka for text classification?

2) Can applying feature reduction methods such as LSI improve classification accuracy?

Solution

  1. Take a look at the FilteredClassifier class or at AttributeSelectedClassifier. With FilteredClassifier you can apply a feature reduction method such as Principal Component Analysis (PCA). Here is a video on how to filter your dataset using PCA, so that you can try different classifiers on the reduced dataset.

  2. It can help, but there is no guarantee. Removing redundant features, or transforming them (as SVD or PCA do), can make the classification task simpler. In any case, a large number of features usually leads to the curse of dimensionality, and attribute selection is one way to mitigate it.
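
Both points can be tried out together. What follows is a minimal sketch, not a definitive recipe: it wraps StringToWordVector and the PrincipalComponents filter in a FilteredClassifier and compares 10-fold cross-validation accuracy with and without the PCA step. It assumes Weka 3.7 or later on the classpath and an ARFF file (path passed as the first argument) with a string attribute and the class as the last attribute. The class name `CompareReduction`, the choice of NaiveBayes, the 1000-word limit and the 95% variance setting are illustrative assumptions.

```java
import java.util.Random;

import weka.classifiers.Classifier;
import weka.classifiers.Evaluation;
import weka.classifiers.bayes.NaiveBayes;
import weka.classifiers.meta.FilteredClassifier;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;
import weka.filters.Filter;
import weka.filters.MultiFilter;
import weka.filters.unsupervised.attribute.PrincipalComponents;
import weka.filters.unsupervised.attribute.StringToWordVector;

public class CompareReduction {

    /** Builds an untrained text classifier, optionally with a PCA step after vectorization. */
    public static FilteredClassifier pipeline(boolean withPca) {
        // Convert string attributes into word attributes (illustrative vocabulary limit).
        StringToWordVector toWords = new StringToWordVector();
        toWords.setWordsToKeep(1000);

        FilteredClassifier clf = new FilteredClassifier();
        if (withPca) {
            // Keep enough principal components to cover 95% of the variance (assumed value).
            PrincipalComponents pca = new PrincipalComponents();
            pca.setVarianceCovered(0.95);

            // Apply the filters in order: vectorize first, then reduce.
            MultiFilter chain = new MultiFilter();
            chain.setFilters(new Filter[] { toWords, pca });
            clf.setFilter(chain);
        } else {
            clf.setFilter(toWords);
        }
        clf.setClassifier(new NaiveBayes());
        return clf;
    }

    /** Returns the percentage of correctly classified instances under 10-fold cross-validation. */
    public static double accuracy(Classifier clf, Instances data) throws Exception {
        Evaluation eval = new Evaluation(data);
        eval.crossValidateModel(clf, data, 10, new Random(1));
        return eval.pctCorrect();
    }

    public static void main(String[] args) throws Exception {
        Instances data = DataSource.read(args[0]);     // path to an ARFF file
        data.setClassIndex(data.numAttributes() - 1);  // assumes the class is the last attribute

        System.out.printf("without PCA: %.2f%%%n", accuracy(pipeline(false), data));
        System.out.printf("with PCA:    %.2f%%%n", accuracy(pipeline(true), data));
    }
}
```

Because the filters sit inside the FilteredClassifier, they are fitted on the training folds only, so the comparison does not leak information from the test folds. Whether the reduced version scores higher depends on the dataset, and PCA on a large vocabulary can be slow.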

Licensed under: CC-BY-SA with attribution