How can I change the classification threshold in NaiveBayesMultinomial, or compute the confusion matrix manually, in Weka?

Question 1

I searched around on Google, and it seems WEKA does not support this directly.

But it is still feasible: under 'Test options' -> 'More options...', select 'Output predictions'. WEKA will then output the predicted probability for each test sample.

From there I can use another tool for the rest of the work.
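As a rough sketch of what "the rest of the work" could look like, the snippet below recomputes a two-class confusion matrix at a custom threshold. It assumes the predictions have already been parsed into (actual label, probability of the positive class) pairs; the function name, the labels "A" and "B", and the sample values are made up for illustration and are not part of WEKA.

```python
from collections import Counter


def confusion_matrix_at(rows, threshold, positive="A", negative="B"):
    """Count TP/FN/FP/TN for (actual_label, prob_of_positive) pairs.

    Assumption: `rows` was parsed from WEKA's prediction output beforehand.
    """
    counts = Counter()
    for actual, prob in rows:
        # Predict the positive class only when its probability exceeds the threshold.
        predicted = positive if prob > threshold else negative
        counts[(actual, predicted)] += 1
    return {
        "TP": counts[(positive, positive)],
        "FN": counts[(positive, negative)],
        "FP": counts[(negative, positive)],
        "TN": counts[(negative, negative)],
    }


# Hypothetical sample data, not real WEKA output.
rows = [("A", 0.9), ("A", 0.45), ("B", 0.42), ("B", 0.1)]
print(confusion_matrix_at(rows, 0.5))
print(confusion_matrix_at(rows, 0.4))
```

Lowering the threshold from 0.5 to 0.4 in this toy data turns one false negative into a true positive, at the price of one new false positive.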

Question 2

You can change it in the cost-benefit analysis screen. Right-click your result in the result list and select 'Visualize threshold curve'.

Inside, there is a slider to move the threshold, and the new confusion matrix is shown in the bottom left-hand corner.


Question 3

The probability threshold can be adjusted by using cost-sensitive classification.

If the desired threshold is k, set the cost of false positives μ and the cost of false negatives λ such that:

k = μ / (μ + λ)

For example, if you want a threshold of 0.4, set μ to 2 and λ to 3. In other words, use a cost matrix of:

0 3
2 0

Reference: More Data Mining with Weka — Lesson 4.6 Cost-sensitive classification vs. cost-sensitive learning (slides).
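The relation k = μ / (μ + λ) can also be turned around to pick integer costs for a given threshold. The helper below is only an illustrative sketch (the function names are hypothetical and nothing here calls WEKA); it uses exact fractions so that 0.4 maps cleanly to μ = 2 and λ = 3.

```python
from fractions import Fraction


def costs_for_threshold(k):
    """Return (mu, lam) with k = mu / (mu + lam).

    mu is the false-positive cost, lam the false-negative cost.
    Pass k as a string or Fraction (e.g. "0.4"); a float like 0.4 is not
    exactly 2/5 in binary and would give unwieldy costs.
    """
    k = Fraction(k)
    if not 0 < k < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    mu = k.numerator
    lam = k.denominator - k.numerator
    return mu, lam


def cost_matrix(mu, lam):
    # Same layout as the example matrix: zeros on the diagonal,
    # false-negative cost top right, false-positive cost bottom left.
    return [[0, lam], [mu, 0]]


mu, lam = costs_for_threshold("0.4")
print(mu, lam)
print(cost_matrix(mu, lam))
```

Any positive multiple of the pair works equally well, since only the ratio of the two costs enters the formula.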

Explanation of formula:

In Naive Bayes with two classes, if class A has a probability of p, then class B has a probability of (1 - p).

If the threshold is 0.5, we classify as class A if we get p > 0.5, or in other words, p > (1 - p).

Suppose the cost of misclassifying A as B (a false negative) is C_a, and the cost of misclassifying B as A (a false positive) is C_b. Predicting B then has an expected cost of C_a * p, and predicting A has an expected cost of C_b * (1 - p). We classify as class A only when predicting B would be the more expensive choice, that is, when:

C_a * p > C_b * (1 - p)

Rearranging the inequality, we get:

p > C_b / (C_a + C_b)

With μ = C_b and λ = C_a, the right-hand side is exactly the threshold k = μ / (μ + λ).
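As a quick sanity check of the rearrangement, the sketch below compares the cost-based rule with the threshold rule over a grid of probabilities, using the example costs C_a = 3 and C_b = 2. The function names are made up for illustration, and exact fractions are used so that the comparison right at the threshold is not disturbed by floating-point rounding.

```python
from fractions import Fraction


def classify_by_cost(p, c_a, c_b):
    """Pick A when C_a * p > C_b * (1 - p), otherwise B."""
    return "A" if c_a * p > c_b * (1 - p) else "B"


def classify_by_threshold(p, c_a, c_b):
    """Pick A when p > C_b / (C_a + C_b), otherwise B."""
    return "A" if p > Fraction(c_b, c_a + c_b) else "B"


# Probabilities 0.00, 0.01, ..., 1.00 as exact fractions.
grid = [Fraction(i, 100) for i in range(101)]
agree = all(
    classify_by_cost(p, 3, 2) == classify_by_threshold(p, 3, 2) for p in grid
)
print(agree)
```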