Question

I am really confused about how to calculate precision and recall for a supervised machine learning algorithm using a Naive Bayes (NB) classifier.

For example:
1) I have two classes, A and B.
2) I have 10000 documents, of which 2000 go into the training sample set (class A = 1000, class B = 1000).
3) On the basis of that training sample set, I classify the remaining 8000 documents using the NB classifier.
4) After classification, 5000 documents go to class A and 3000 documents go to class B.
5) Now, how do I calculate precision and recall?

Please help.

Thanks

Solution

You have to divide the results into four groups:
True class A (TA) - correctly classified into class A
False class A (FA) - incorrectly classified into class A
True class B (TB) - correctly classified into class B
False class B (FB) - incorrectly classified into class B

Taking class A as the class of interest:

precision = TA / (TA + FA)
recall = TA / (TA + FB)

You might also need accuracy and F-measure:

accuracy = (TA + TB) / (TA + TB + FA + FB)
f-measure = 2 * ((precision * recall)/(precision + recall))
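
The formulas above can be sketched in Python as follows. This is only a minimal illustration: the function name `binary_metrics` is hypothetical, and the counts in the usage lines are made up (the question does not say how many of the 8000 test documents truly belong to each class; the example assumes 4000 of each).

```python
def binary_metrics(ta, fa, tb, fb):
    """Return (precision, recall, accuracy, f_measure) for class A.

    ta: correctly classified into class A
    fa: incorrectly classified into class A
    tb: correctly classified into class B
    fb: incorrectly classified into class B
    """
    # Guard against division by zero when nothing was assigned to A
    # or when there are no true class A documents.
    precision = ta / (ta + fa) if (ta + fa) else 0.0
    recall = ta / (ta + fb) if (ta + fb) else 0.0
    accuracy = (ta + tb) / (ta + tb + fa + fb)
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, accuracy, f_measure


# Hypothetical counts (assumption, not from the question):
# 5000 documents predicted A (3500 right, 1500 wrong),
# 3000 documents predicted B (2500 right, 500 wrong).
p, r, acc, f1 = binary_metrics(ta=3500, fa=1500, tb=2500, fb=500)
print(f"{p:.3f} {r:.3f} {acc:.3f} {f1:.3f}")  # 0.700 0.875 0.750 0.778
```

Swapping the roles of A and B (passing `ta=2500, fa=500, tb=3500, fb=1500`) gives the same metrics for class B.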

More here:
http://en.wikipedia.org/wiki/Precision_and_recall#Definition_.28classification_context.29

Other tips

Let me explain a bit for clarity.

Suppose there are 9 dogs and some cats in a video, and the image processing algorithm tells you there are 7 dogs in the scene. Of those 7, only 4 are actually dogs (true positives), while the other 3 are cats (false positives).

Precision tells us, out of the items classified as dogs, how many were actually dogs.

So precision = true positives / (true positives + false positives) = 4/(4+3) = 4/7.

Recall tells us, out of the total number of dogs, how many dogs were actually found. Here 5 of the 9 dogs were missed (false negatives).

So recall = true positives / total number of actual dogs = true positives / (true positives + false negatives) = 4/(4+5) = 4/9.
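
As a quick check of the arithmetic in the dog example, here is a small sketch using exact fractions. The variable names are chosen for illustration only.

```python
from fractions import Fraction

total_dogs = 9       # dogs actually present in the video
reported_dogs = 7    # items the algorithm labelled as dogs
true_positives = 4   # reported dogs that really are dogs

false_positives = reported_dogs - true_positives  # cats labelled as dogs: 3
false_negatives = total_dogs - true_positives     # dogs that were missed: 5

precision = Fraction(true_positives, true_positives + false_positives)
recall = Fraction(true_positives, true_positives + false_negatives)

print(precision, recall)  # 4/7 4/9
```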


In your problem:

You have to find precision and recall for both class A and class B.

For Class A

True positive = (Number of class A documents in the 5000 classified class A documents)

False positive = (Number of class B documents in the 5000 classified class A documents)

From the above you can find Precision.

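Recall = True positive/(Total number of class A documents used while testing)

Note that all of these counts require knowing the true class of each test document; the 5000/3000 split produced by the classifier is not enough on its own.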

Repeat the above for Class B to find its precision and recall.
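
The per-class procedure above could look like this in Python, assuming you have the true label of every test document alongside the classifier's prediction. The function name `per_class_precision_recall` and the tiny label lists are hypothetical and only serve as an illustration.

```python
def per_class_precision_recall(true_labels, predicted_labels, target):
    """Precision and recall for one class, treating `target` as positive."""
    pairs = list(zip(true_labels, predicted_labels))
    # Predicted as target and really target.
    tp = sum(1 for t, p in pairs if p == target and t == target)
    # Predicted as target but really another class.
    fp = sum(1 for t, p in pairs if p == target and t != target)
    # Really target but predicted as another class.
    fn = sum(1 for t, p in pairs if p != target and t == target)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Made-up test data: 8 documents with known true classes.
true_labels = ["A", "A", "A", "B", "B", "B", "A", "B"]
predicted_labels = ["A", "A", "B", "B", "A", "B", "A", "A"]

for cls in ("A", "B"):
    precision, recall = per_class_precision_recall(true_labels, predicted_labels, cls)
    print(cls, round(precision, 3), round(recall, 3))
# A 0.6 0.75
# B 0.667 0.5
```

With the real data, `true_labels` would hold the known classes of the 8000 test documents and `predicted_labels` the NB classifier's output.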

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow