Question

I'm looking for a Multinomial Naive Bayes classifier written in C/C++ for use with OpenCV.

I'm looking for the algorithm (or a ready-made implementation), since I'm trying to understand how it works.

Solution

Naive Bayes is a well-known classification algorithm, especially in the field of text classification, so I will use text classification to explain it.

Assume we have training documents {d1, d2, d3, ..., dm}, where each document is represented by a collection of words {w1, w2, w3, ..., wn} and belongs to one of a predefined set of classes (take the binary case (c_0, c_1) here). Our task is to classify a new input document d into either class c_0 or class c_1.

An intuitive approach is to pick the class with the maximum likelihood, that is,

output c_0 if P(d | c_0) > P(d | c_1), and c_1 otherwise.

By our definition of d, we can write the likelihood as

P(d | c_0) = P( {w1,w2,w3...,wn}  | c_0)

Calculating this joint probability given the class directly is complicated, so we make the strong assumption that the words are mutually independent conditioned on the class. That leads us to

P(d | c_0) = P({w1,w2,w3...,wn} | c_0) = P(w1|c_0)*P(w2|c_0)*P(w3|c_0)*...*P(wn|c_0)

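where each P(w | c) can be easily estimated as the relative frequency of word w in class c: the number of occurrences of w in the documents of class c divided by the total number of words in those documents.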
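The frequency estimate of P(w | c) can be sketched in C++ roughly as follows. This is only an illustration under stated assumptions: the function name `estimateWordProbabilities` is hypothetical, documents are assumed to be already tokenized into words, and nothing here uses OpenCV.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper (not from any library): estimates P(w | c) for ONE class
// from the training documents of that class. Each document is a list of words.
std::map<std::string, double> estimateWordProbabilities(
    const std::vector<std::vector<std::string>>& docsInClass) {
    std::map<std::string, int> counts;  // occurrences of each word in this class
    int total = 0;                      // total number of words in this class

    for (const auto& doc : docsInClass) {
        for (const auto& word : doc) {
            ++counts[word];
            ++total;
        }
    }
    assert(total > 0);  // at least one word is needed to form a frequency

    // P(w | c) = count(w in c) / total words in c
    std::map<std::string, double> probs;
    for (const auto& entry : counts) {
        probs[entry.first] = static_cast<double>(entry.second) / total;
    }
    return probs;
}
```

Note that a word which never occurs in a class gets no entry here, i.e. an estimated probability of zero, and a single zero factor makes the whole product zero. Practical implementations usually add some form of smoothing to avoid that.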

This strong assumption is the reason for the name "Naive", since we simply multiply the per-word probabilities together.

Finally, taking answer = argmax over c in {c_0, c_1} of P(d | c) completes the algorithm.
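Putting the steps together, a minimal two-class sketch in C++ might look like the one below. It is not a reference implementation: the names `Document`, `NaiveBayesModel`, `train`, `logLikelihood` and `classify` are hypothetical, and it makes two choices that go beyond the description above. It sums log-probabilities instead of multiplying probabilities (to avoid numeric underflow on long documents), and it uses add-one smoothing so that unseen words do not produce a zero probability.

```cpp
#include <cassert>
#include <cmath>
#include <map>
#include <set>
#include <string>
#include <vector>

using Document = std::vector<std::string>;  // a document as a list of words

// Hypothetical two-class model: per-class word counts plus a shared vocabulary.
struct NaiveBayesModel {
    std::map<std::string, int> counts[2];  // counts[c][w] = occurrences of w in class c
    int totalWords[2] = {0, 0};            // total number of words seen per class
    std::set<std::string> vocabulary;      // all distinct words seen in training

    // Accumulate word counts for one labelled training document.
    void train(const Document& doc, int label) {
        assert(label == 0 || label == 1);
        for (const auto& word : doc) {
            ++counts[label][word];
            ++totalWords[label];
            vocabulary.insert(word);
        }
    }

    // log P(d | c) = sum over words of log P(w | c).
    // Assumption: add-one smoothing, P(w | c) = (count + 1) / (total + |V|).
    double logLikelihood(const Document& doc, int label) const {
        assert(label == 0 || label == 1);
        const double denom =
            totalWords[label] + static_cast<double>(vocabulary.size());
        assert(denom > 0.0);  // the model must have been trained first
        double sum = 0.0;
        for (const auto& word : doc) {
            auto it = counts[label].find(word);
            const int c = (it == counts[label].end()) ? 0 : it->second;
            sum += std::log((c + 1.0) / denom);
        }
        return sum;
    }

    // argmax over the two classes; ties go to class 0.
    int classify(const Document& doc) const {
        return logLikelihood(doc, 0) >= logLikelihood(doc, 1) ? 0 : 1;
    }
};
```

To use it, call `train` once per labelled document and then `classify` on a new document. Like the description above, this compares likelihoods only and ignores class priors P(c); adding `log P(c)` to each score would turn it into the usual posterior comparison.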

I guess what you're looking for in your domain is similar to text classification, except that the features you need to extract are different.

Licensed under: CC-BY-SA with attribution