Question

I found a page about the multinomial naive Bayes classifier:

multinomial naive bayes link

How do we calculate B' or |V|?

The page says that it is the number of terms in the vocabulary. In its example, how do we get 6 for B? Is it a count of all the terms?

"chinese", "beijing", "shanghai", "macao", "tokyo", "japan"

One more question: what if a new term appears in a test document? For example, suppose doc 6 contains "bangkok" or some other word that never appeared before. How do we compute the probability of the new term?

Solution

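You are right. B' (written |V| on that page) is the number of distinct terms in the vocabulary: each term has exactly one entry, no matter how many times it occurs in the training documents. That is why the six terms you list give B = 6.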
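
To make the counting concrete, here is a minimal Python sketch. The four training documents and the class labels "c" and "j" are assumptions chosen so that the vocabulary matches the six terms in the question; they are not copied from the linked page. The helper names cond_prob and known_terms are hypothetical. The sketch uses add-one (Laplace) smoothing, in which B appears in the denominator, and it shows one common convention for a term such as "bangkok" that is absent from the vocabulary: skip it when scoring a test document. Other conventions exist, such as reserving an explicit unknown-word entry.

```python
from collections import Counter

# Assumed toy training set (illustrative only): (text, class label).
train = [
    ("chinese beijing chinese", "c"),
    ("chinese chinese shanghai", "c"),
    ("chinese macao", "c"),
    ("tokyo japan chinese", "j"),
]

# The vocabulary is the set of distinct terms; repeats are counted once.
vocab = {term for text, _ in train for term in text.split()}
B = len(vocab)


def cond_prob(term, label):
    """Add-one smoothed estimate of P(term | label)."""
    counts = Counter(
        t for text, y in train if y == label for t in text.split()
    )
    total_tokens = sum(counts.values())
    return (counts[term] + 1) / (total_tokens + B)


def known_terms(text):
    """Keep only terms that are in the training vocabulary (assumed convention)."""
    return [t for t in text.split() if t in vocab]


print(B)                                  # size of the vocabulary
print(cond_prob("chinese", "c"))          # (5 + 1) / (8 + 6)
print(known_terms("chinese bangkok"))     # "bangkok" is dropped
```

With these assumed documents, class "c" contains 8 tokens, 5 of which are "chinese", so the smoothed estimate is (5 + 1) / (8 + 6) = 3/7. A term that is in the vocabulary but never seen in a class still gets a nonzero probability, which is the purpose of adding B in the denominator.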

Licensed under: CC-BY-SA with attribution