Differences between StandardNaiveBayesClassifier and ComplementaryNaiveBayesClassifier in Mahout

StackOverflow https://stackoverflow.com/questions/20694184

Question

Maybe my question is quite sophisticated, but I would like to know the main differences between the StandardNaiveBayesClassifier and ComplementaryNaiveBayesClassifier algorithms in Mahout. Which one performs better on a smaller amount of training data, or is that a data-dependent issue? Which one is better for sentiment analysis? And any other relevant aspects...

Thank you in advance!

Solution

Complement naive Bayes is a naive Bayes variant that tends to work better than the vanilla version when the classes in the training set are imbalanced. In short, it estimates the feature probabilities for each class y from the complement of y, i.e. from the samples of all other classes, instead of from the training samples of class y itself.
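
To make the difference concrete, here is a minimal plain-Python sketch of the two estimates on term-count vectors. It is not Mahout's implementation: the function names, the toy data and the Laplace smoothing value alpha=1.0 are all assumptions chosen for illustration.

```python
import math
from collections import defaultdict


def class_term_counts(docs, labels, vocab_size):
    """Sum the term-count vectors of all documents, per class."""
    counts = defaultdict(lambda: [0] * vocab_size)
    for doc, label in zip(docs, labels):
        row = counts[label]
        for i, f in enumerate(doc):
            row[i] += f
    return dict(counts)


def standard_log_probs(counts, alpha=1.0):
    """Standard NB: log P(term | class), estimated from the class's own documents."""
    out = {}
    for c, row in counts.items():
        total = sum(row) + alpha * len(row)
        out[c] = [math.log((n + alpha) / total) for n in row]
    return out


def complement_log_probs(counts, alpha=1.0):
    """Complement NB: log P(term | NOT class), estimated from all other classes."""
    out = {}
    for c in counts:
        vocab = len(counts[c])
        comp = [sum(counts[o][i] for o in counts if o != c) for i in range(vocab)]
        total = sum(comp) + alpha * vocab
        out[c] = [math.log((n + alpha) / total) for n in comp]
    return out


def predict_standard(doc, log_probs, log_priors):
    """Pick the class with the highest prior plus term log-likelihood."""
    def score(c):
        return log_priors[c] + sum(f * w for f, w in zip(doc, log_probs[c]))
    return max(log_probs, key=score)


def predict_complement(doc, comp_log_probs):
    """Pick the class whose complement matches the document worst (no prior used)."""
    def score(c):
        return sum(f * w for f, w in zip(doc, comp_log_probs[c]))
    return min(comp_log_probs, key=score)


# Hypothetical, deliberately imbalanced toy data over the terms [good, bad, film].
docs = [[2, 0, 1], [1, 0, 1], [3, 0, 2], [2, 1, 1], [0, 2, 1]]
labels = ["pos", "pos", "pos", "pos", "neg"]

counts = class_term_counts(docs, labels, vocab_size=3)
log_priors = {c: math.log(labels.count(c) / len(labels)) for c in counts}
std = standard_log_probs(counts)
comp = complement_log_probs(counts)

query = [0, 1, 2]  # one "bad", two "film"
print(predict_standard(query, std, log_priors), predict_complement(query, comp))
```

On this toy data the standard rule picks pos for the query while the complement rule picks neg. With only two classes the complement of one class is simply the other class, so the difference here comes from the class prior, which the complement rule in this sketch leaves out; with three or more classes the complement estimates also pool much more data per class than the per-class estimates do.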

Other suggestions

The Complement Naive Bayes (CNB) classifier addresses a weakness of the Naive Bayes classifier by estimating parameters from the data in all sentiment classes except the one being evaluated.

1) Even though Naïve Bayes often performs well, it rests on several poor assumptions, such as independence between features, and it copes badly with uneven amounts of training data per class (skewed data).

2) Complement Naïve Bayes is a Naïve Bayes variant that tackles these poor assumptions of the parent classifier: the uneven training size (the most frequent class in the training data dominates during classification) and the independence assumption (all features or attributes are treated individually).

Skewed data means having more training examples for one class than for another, which biases the decision boundary weights. This in turn leads the classifier to unwittingly prefer one class over the other. To counter this problem, Complement Naïve Bayes proposes a probability estimate that uses data from all classes except c.
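
For reference, the complement estimate is commonly written as below. This follows the formulation in Rennie et al. (2003); the symbols are notation assumed here, and whether Mahout implements exactly this form is not confirmed by the answers above.

```latex
\hat{\theta}_{ci} =
  \frac{\alpha_i + \sum_{j \,:\, y_j \neq c} d_{ij}}
       {\alpha + \sum_{j \,:\, y_j \neq c} \sum_{k} d_{kj}},
\qquad
\hat{y}(t) = \arg\min_{c} \sum_{i} t_i \log \hat{\theta}_{ci}
```

Here d_{ij} is the count of term i in training document j, y_j is the label of document j, \alpha_i is a smoothing constant with \alpha = \sum_i \alpha_i, and t_i is the count of term i in the document being classified. The class prior is left out of the decision rule in this form; the predicted class is the one whose complement fits the document worst.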

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow