Weka - How to find input format for classifiers

https://stackoverflow.com/questions/18726665

28-06-2022

Question

I am using Weka in a Java program to classify some text documents, and have it working well with the NaiveBayesMultinomial classifier.

However, I can't seem to find any documentation on how I might filter my Instances (or ARFF file) so that they can be accepted as input by the other classifiers. If I load the ARFF into the Weka Explorer GUI, most of the classifiers are greyed out. Using the StringToWordVector filter doesn't affect this, and I have tried a few others as well.

Can anyone tell me how I can prepare my data so it can be accepted by other classifiers, for example NaiveBayes, JRip or BayesNet?

Solution

In the WEKA Explorer GUI, applying the StringToWordVector filter usually moves the former class attribute to the first position. The Classify tab assumes by default that the last attribute is the class, so your original class is no longer picked up automatically. On the Classify tab, make sure the correct attribute is selected as the class for your experiment.
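Since you are working from Java, the same fix can be sketched in code: select the class attribute by name rather than by position, both before and after filtering. This is only a minimal sketch. The class name VectorizeWithClass and the classAttrName parameter are made up for illustration, and it assumes the label's name does not also occur as a word in your documents.

```java
import weka.core.Attribute;
import weka.core.Instances;
import weka.filters.Filter;
import weka.filters.unsupervised.attribute.StringToWordVector;

class VectorizeWithClass {

    /**
     * Runs StringToWordVector over a copy of the data and returns the result
     * with the class attribute re-selected by name.
     * classAttrName is a hypothetical parameter: pass whatever name your
     * ARFF header gives to the label attribute.
     */
    static Instances vectorize(Instances raw, String classAttrName) throws Exception {
        Instances input = new Instances(raw); // work on a copy, leave the caller's data alone
        input.setClassIndex(requireAttribute(input, classAttrName).index());

        StringToWordVector toWords = new StringToWordVector();
        toWords.setInputFormat(input); // must be called before useFilter
        Instances vectors = Filter.useFilter(input, toWords);

        // The filter may reorder attributes, so look the class up by name again
        vectors.setClassIndex(requireAttribute(vectors, classAttrName).index());
        return vectors;
    }

    private static Attribute requireAttribute(Instances data, String name) {
        Attribute attribute = data.attribute(name);
        if (attribute == null) {
            throw new IllegalArgumentException("No attribute named " + name);
        }
        return attribute;
    }
}
```

Looking the attribute up by name avoids hard-coding index 0 or "last", which is exactly the assumption that goes wrong once the filter has rearranged the attributes.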

Another potential source of problems is a numeric class, which prevents algorithms that expect a nominal class from being applied.
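From Java you can ask a classifier why it rejects a dataset through its Capabilities, and convert a numeric class to nominal with the NumericToNominal filter. The following is a sketch under assumptions, not the definitive way to do it: the NominalClassFix class and both helper names are invented for illustration, and it only makes sense when the numeric class really encodes a small set of categories (for example 0/1 labels).

```java
import weka.core.Attribute;
import weka.core.Capabilities;
import weka.core.Instances;
import weka.filters.Filter;
import weka.filters.unsupervised.attribute.NumericToNominal;

class NominalClassFix {

    /**
     * Returns null if the given capabilities accept the data,
     * otherwise the reason reported by Weka.
     */
    static String whyRejected(Capabilities capabilities, Instances data) {
        if (capabilities.test(data)) {
            return null;
        }
        Exception reason = capabilities.getFailReason();
        return reason == null ? "unknown reason" : reason.getMessage();
    }

    /**
     * Converts the named numeric attribute to nominal and marks it as the class.
     * classAttrName is a hypothetical parameter naming your label attribute.
     */
    static Instances withNominalClass(Instances data, String classAttrName) throws Exception {
        Instances copy = new Instances(data);
        Attribute target = copy.attribute(classAttrName);
        if (target == null) {
            throw new IllegalArgumentException("No attribute named " + classAttrName);
        }
        int index = target.index();
        copy.setClassIndex(-1); // unset the class while filtering, assign it afterwards

        NumericToNominal toNominal = new NumericToNominal();
        toNominal.setAttributeIndices(String.valueOf(index + 1)); // filter ranges are 1-based
        toNominal.setInputFormat(copy);
        Instances converted = Filter.useFilter(copy, toNominal);

        converted.setClassIndex(index);
        return converted;
    }
}
```

For example, calling whyRejected(new NaiveBayes().getCapabilities(), data) on data with a numeric class should return a message rather than null, and the same call on the output of withNominalClass should return null if nothing else is blocking the classifier.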

If this does not solve your problem, please post an excerpt of your ARFF file (the header plus one instance) so that we can give more precise advice.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow