1) How can I apply feature reduction methods such as LSI in Weka for text classification?

2) Can applying feature reduction methods such as LSI improve classification accuracy?


Solution

  1. Take a look at the FilteredClassifier class or at AttributeSelectedClassifier. With FilteredClassifier you can apply a feature reduction method such as Principal Component Analysis (PCA). Here is a video showing how to filter your dataset using PCA, so that you can try different classifiers on the reduced dataset.

  2. It can help, but there is no guarantee. If you remove redundant features, or transform the features in some way (as SVD, which underlies LSI, or PCA do), the classification task can become simpler. In any case, a large number of features usually leads to the curse of dimensionality, and attribute selection is one way to mitigate it.
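For point 1, here is a minimal sketch of wrapping PCA in a FilteredClassifier through the Weka Java API. The file name `documents.arff`, its layout (one string attribute holding the text, nominal class as the last attribute), the NaiveBayes base classifier, and the parameter values are all assumptions chosen for illustration; Weka must be on the classpath.

```java
import java.util.Random;

import weka.classifiers.Evaluation;
import weka.classifiers.bayes.NaiveBayes;
import weka.classifiers.meta.FilteredClassifier;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;
import weka.filters.Filter;
import weka.filters.MultiFilter;
import weka.filters.unsupervised.attribute.PrincipalComponents;
import weka.filters.unsupervised.attribute.StringToWordVector;

public class PcaTextClassification {
    public static void main(String[] args) throws Exception {
        // Hypothetical dataset: a string attribute with the text, class attribute last.
        Instances data = DataSource.read("documents.arff");
        data.setClassIndex(data.numAttributes() - 1);

        // Step 1: convert the raw text into word-count attributes.
        StringToWordVector toVector = new StringToWordVector();
        toVector.setWordsToKeep(1000); // illustrative vocabulary cap, tune for your data

        // Step 2: project the word attributes onto principal components.
        PrincipalComponents pca = new PrincipalComponents();
        pca.setVarianceCovered(0.95); // keep enough components for 95% of the variance

        // Chain both filters so they run in order.
        MultiFilter pipeline = new MultiFilter();
        pipeline.setFilters(new Filter[] {toVector, pca});

        // The filters are fitted on the training data only, then applied to test data.
        FilteredClassifier classifier = new FilteredClassifier();
        classifier.setFilter(pipeline);
        classifier.setClassifier(new NaiveBayes()); // any Weka classifier can go here

        // 10-fold cross-validation of the whole filter + classifier pipeline.
        Evaluation eval = new Evaluation(data);
        eval.crossValidateModel(classifier, data, 10, new Random(1));
        System.out.println(eval.toSummaryString());
    }
}
```

Putting the filters inside FilteredClassifier, rather than filtering the whole dataset first, means the PCA projection is rebuilt from the training part of each fold, so the evaluation does not see the test data in advance. As far as I know, Weka also offers an SVD-based `LatentSemanticAnalysis` attribute evaluator meant to be used with AttributeSelectedClassifier and the `Ranker` search, but depending on the Weka version it may have to be installed as a separate package.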
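To illustrate point 2, here is a small plain-Java sketch (no Weka needed) of why a transformation like PCA can absorb redundant features. The toy data, the class name `PcaSketch`, and the use of power iteration to find the first principal component are my own choices for illustration, not something Weka does in exactly this way.

```java
import java.util.Arrays;

/** Minimal PCA sketch: leading principal component via power iteration. */
final class PcaSketch {

    /** Sample covariance matrix of the columns of x (each row is one instance). */
    static double[][] covariance(double[][] x) {
        int n = x.length, d = x[0].length;
        double[] mean = new double[d];
        for (double[] row : x)
            for (int j = 0; j < d; j++) mean[j] += row[j] / n;
        double[][] cov = new double[d][d];
        for (double[] row : x)
            for (int i = 0; i < d; i++)
                for (int j = 0; j < d; j++)
                    cov[i][j] += (row[i] - mean[i]) * (row[j] - mean[j]) / (n - 1);
        return cov;
    }

    /** Unit-length leading eigenvector of a symmetric matrix, by power iteration. */
    static double[] leadingComponent(double[][] cov) {
        int d = cov.length;
        double[] v = new double[d];
        Arrays.fill(v, 1.0 / Math.sqrt(d)); // arbitrary non-zero starting direction
        for (int iter = 0; iter < 1000; iter++) {
            double[] w = new double[d];
            for (int i = 0; i < d; i++)
                for (int j = 0; j < d; j++) w[i] += cov[i][j] * v[j];
            double norm = 0;
            for (double wi : w) norm += wi * wi;
            norm = Math.sqrt(norm);
            if (norm == 0) return v; // all-zero matrix: nothing to find
            for (int i = 0; i < d; i++) v[i] = w[i] / norm;
        }
        return v;
    }

    /** Fraction of the total variance that lies along the unit direction v. */
    static double explainedVariance(double[][] cov, double[] v) {
        double along = 0, total = 0;
        for (int i = 0; i < cov.length; i++) {
            total += cov[i][i];
            for (int j = 0; j < cov.length; j++) along += v[i] * cov[i][j] * v[j];
        }
        return along / total;
    }

    public static void main(String[] args) {
        // Toy data: feature 2 is exactly twice feature 1 (fully redundant);
        // feature 3 has very little variance.
        double[][] x = {
            {1, 2, 0.1}, {2, 4, -0.1}, {3, 6, 0.1}, {4, 8, -0.1}, {5, 10, 0.0}
        };
        double[][] cov = covariance(x);
        double[] pc = leadingComponent(cov);
        System.out.println("first component: " + Arrays.toString(pc));
        System.out.println("variance explained: " + explainedVariance(cov, pc));
    }
}
```

On this toy data a single component captures nearly all of the variance, so three attributes can be replaced by one with little loss. Whether that also helps accuracy depends on the data: PCA ignores the class labels, so a low-variance direction it discards can still be the one that separates the classes.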

Licensed under: CC-BY-SA with attribution