Question

I have a training.arff file in which each entry has 2000 features (attributes). I want to select the top n of those attributes using the Information Gain criterion. How can I do that with WEKA from the command line? From what I have read online, it seems to be a two-stage process, because a ranker has to be used as the second step. Could someone explain to me how to do it?

Solution

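Both stages fit in a single command: run the AttributeSelection filter (with weka.jar on the classpath) and pass it the evaluator and the search method: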

java weka.filters.supervised.attribute.AttributeSelection \
-E "weka.attributeSelection.InfoGainAttributeEval" \
-S "weka.attributeSelection.Ranker -N 10" -i training.arff -o training_IG.arff

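The -E option names the attribute evaluator class (here InfoGainAttributeEval), and -S names the search method together with its own options (here Ranker, where -N 10 keeps the 10 top-ranked attributes). Change the value of -N to select a different number of attributes. The -i and -o options give the input and output ARFF files.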
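
To make the evaluator/ranker split concrete, here is a minimal, self-contained Java sketch of the idea. It is not WEKA's code and does not use the WEKA API: the class name InfoGainRankSketch and its methods are hypothetical, it handles only nominal values, and the tiny dataset is made up for illustration. The "evaluator" part scores each attribute by information gain, and the "ranker" part sorts the scores and keeps the first n.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not part of WEKA.
public class InfoGainRankSketch {

    // Shannon entropy (in bits) of a list of nominal labels.
    static double entropy(List<String> labels) {
        Map<String, Integer> counts = new HashMap<>();
        for (String label : labels) {
            counts.merge(label, 1, Integer::sum);
        }
        double h = 0.0;
        for (int c : counts.values()) {
            double p = (double) c / labels.size();
            h -= p * Math.log(p) / Math.log(2);
        }
        return h;
    }

    // "Evaluator" step: information gain of one nominal attribute column,
    // i.e. H(class) - sum over values v of P(attr = v) * H(class | attr = v).
    static double infoGain(String[] column, String[] classes) {
        Map<String, List<String>> byValue = new HashMap<>();
        for (int i = 0; i < column.length; i++) {
            byValue.computeIfAbsent(column[i], k -> new ArrayList<>()).add(classes[i]);
        }
        double conditional = 0.0;
        for (List<String> subset : byValue.values()) {
            conditional += ((double) subset.size() / classes.length) * entropy(subset);
        }
        return entropy(Arrays.asList(classes)) - conditional;
    }

    // "Ranker" step: score every attribute, sort by descending score,
    // and keep the indices of the first n attributes.
    static List<Integer> topN(String[][] columns, String[] classes, int n) {
        double[] scores = new double[columns.length];
        List<Integer> indices = new ArrayList<>();
        for (int a = 0; a < columns.length; a++) {
            scores[a] = infoGain(columns[a], classes);
            indices.add(a);
        }
        indices.sort(Comparator.comparingDouble((Integer a) -> -scores[a]));
        return new ArrayList<>(indices.subList(0, Math.min(n, indices.size())));
    }

    public static void main(String[] args) {
        // Made-up toy data: one array per attribute, one entry per instance.
        String[] classes = {"yes", "yes", "no", "no"};
        String[][] columns = {
            {"a", "a", "a", "a"},   // constant: carries no information about the class
            {"x", "x", "y", "y"},   // determines the class exactly
            {"p", "q", "q", "q"}    // partially informative
        };
        System.out.println(topN(columns, classes, 2)); // prints [1, 2]
    }
}
```

In the WEKA command above, InfoGainAttributeEval plays the role of infoGain and Ranker -N 10 plays the role of topN with n = 10.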

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow