Question

Suppose I have an ARFF file in the following form:

@relation spamOrNot
@attribute body string
@attribute result {spam, notspam}
@data
"free money now!", spam
"hi meet me at 10", notspam

I want to use this file to train a Naive Bayes classifier in Weka. How would I create a test set so that the trained classifier can make predictions on it? Thanks.


Solution

Build the test set the same way you built the training set. Many public data repositories contain real-life spam and non-spam mail examples. Take the bodies of those mails and write them to a second ARFF file with the same header as the training file (the same attributes, in the same order), labeling each row spam for spam mail bodies and notspam for non-spam mail bodies. The test file carries labels so that Weka can compare its predictions against them.

Then, in the Weka Explorer:

1. On the Classify tab, train the classifier on your training ARFF file and save the resulting model (right-click its entry in the result list).
2. Under "Test options", select the "Supplied test set" radio button and choose your test ARFF file.
3. Load the saved model, right-click it in the result list, and select "Re-evaluate model on current test set".

You are done.
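If the labeled mail bodies are already collected somewhere, the test ARFF file can be generated rather than typed by hand. The following is a minimal Python sketch, not part of Weka: the helper names `quote` and `build_arff`, the sample rows, and the output file name `test.arff` are all hypothetical. It assumes the header should mirror the training file from the question, and it collapses line breaks inside a body into spaces as a simplification, so that each instance stays on one line.

```python
def quote(value: str) -> str:
    """Return value as a double-quoted ARFF string on a single line."""
    # Simplification (assumption): collapse all whitespace, including
    # newlines, into single spaces so one mail body is one data row.
    single_line = " ".join(value.split())
    # Escape backslashes first, then double quotes.
    escaped = single_line.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def build_arff(relation, rows, labels=("spam", "notspam")):
    """Build ARFF text for (body, label) rows.

    The header mirrors the training file in the question: one string
    attribute named body and one nominal attribute named result.
    """
    lines = [
        f"@relation {relation}",
        "@attribute body string",
        "@attribute result {" + ", ".join(labels) + "}",
        "@data",
    ]
    for body, label in rows:
        if label not in labels:
            raise ValueError(f"unknown label: {label!r}")
        lines.append(f"{quote(body)}, {label}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical test examples; replace with real labeled mail bodies.
    test_rows = [
        ("win a free prize today", "spam"),
        ("lunch at noon?", "notspam"),
    ]
    with open("test.arff", "w", encoding="utf-8") as handle:
        handle.write(build_arff("spamOrNot", test_rows))
```

Keeping the attribute names, their order, and the label set identical to the training file matters here, since Weka compares the two headers when it evaluates a model on a supplied test set.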

Licensed under: CC-BY-SA with attribution