Question

I have a problem calculating precision and recall for a classifier in MATLAB. I use the fisherIris data (150 data points: 50 setosa, 50 versicolor, 50 virginica) and classified it with the kNN algorithm. Here is my confusion matrix:

50     0     0
 0    48     2
 0     4    46

The correct classification rate is 96% (144/150), but how do I calculate precision and recall in MATLAB? Is there a function for it? I know the formulas, precision = tp/(tp+fp) and recall = tp/(tp+fn), but I am lost in identifying the components. For instance, can I say that the true positive count is 144 from the matrix? What about false positives and false negatives? Please help! I would really appreciate it. Thank you!

Solution

To add to pederpansen's answer, here are some anonymous MATLAB functions for calculating precision, recall and F1 score for each class, and the mean F1 score over all classes:

% Assumes rows = predicted class, columns = true class.
precision = @(confusionMat) diag(confusionMat)./sum(confusionMat,2);

recall = @(confusionMat) diag(confusionMat)./sum(confusionMat,1)';

f1Scores = @(confusionMat) 2*(precision(confusionMat).*recall(confusionMat))./(precision(confusionMat)+recall(confusionMat));

meanF1 = @(confusionMat) mean(f1Scores(confusionMat));
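
As a quick check, these handles can be applied to the confusion matrix from the question. This is only a minimal sketch: it assumes the matrix is laid out with rows as predictions and columns as true classes, and the variable names `M`, `p`, `r`, `f1` and `mf` are chosen here for illustration.

```matlab
% Confusion matrix from the question (assumed: rows = predicted, columns = true)
M = [50 0 0; 0 48 2; 0 4 46];

precision = @(confusionMat) diag(confusionMat) ./ sum(confusionMat, 2);
recall    = @(confusionMat) diag(confusionMat) ./ sum(confusionMat, 1)';
f1Scores  = @(confusionMat) 2 * (precision(confusionMat) .* recall(confusionMat)) ...
                              ./ (precision(confusionMat) + recall(confusionMat));
meanF1    = @(confusionMat) mean(f1Scores(confusionMat));

p  = precision(M);   % per-class precision: 50/50, 48/50, 46/50
r  = recall(M);      % per-class recall:    50/50, 48/52, 46/48
f1 = f1Scores(M);    % one F1 score per class
mf = meanF1(M);      % a single summary number
```

Each of `p`, `r` and `f1` is a 3-by-1 column vector, ordered like the rows of `M`.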

Other tips

As Dan pointed out in his comment, precision and recall are usually defined for binary classification problems only.

But you can calculate precision and recall separately for each class. Let's annotate your confusion matrix a little bit:

          |                  true           |
          |      |  seto  |  vers  |  virg  |
          -----------------------------------
          | seto |   50        0        0
predicted | vers |    0       48        2
          | virg |    0        4       46

Here I assumed that columns hold the true values and rows hold the values predicted by your learning algorithm. (If your matrix was built the other way round, as is the case for the output of MATLAB's confusionmat, whose rows are the true classes, simply take the transpose of the confusion matrix.)

The true positives (tp(i)) for each class (= row/column index) i are given by the diagonal element in that row/column. The true negatives (tn(i)) are then given by the sum of all entries that lie in neither row i nor column i, i.e. all samples that neither belong to class i nor were predicted as class i. Note that we simply define the negatives for each class i as "not class i".

If we define false positives (fp) and false negatives (fn) analogously as the sum of off-diagonal entries in a given row or column, respectively, we can calculate precision and recall for each class:

precision(seto) = tp(seto) / (tp(seto) + fp(seto)) = 50 / (50 + (0 + 0)) = 1.0
precision(vers) = 48 / (48 + (0 + 2)) = 0.96
precision(virg) = 46 / (46 + (0 + 4)) = 0.92

recall(seto) = tp(seto) / (tp(seto) + fn(seto)) = 50 / (50 + (0 + 0)) = 1.0
recall(vers) = 48 / (48 + (0 + 4)) = 0.9231
recall(virg) = 46 / (46 + (0 + 2)) = 0.9583

Here I used the class names instead of the row indices for illustration.
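
The same counts can be obtained for all classes at once. The following is a minimal sketch, assuming the rows-are-predictions layout of the annotated matrix; the variable names are illustrative, and true negatives are taken here as every sample that lies outside both row i and column i.

```matlab
M = [50 0 0; 0 48 2; 0 4 46];    % assumed: rows = predicted, columns = true

tp = diag(M);                    % correctly predicted samples of each class
fp = sum(M, 2) - tp;             % predicted as class i, but truly another class
fn = sum(M, 1)' - tp;            % truly class i, but predicted as another class
tn = sum(M(:)) - tp - fp - fn;   % samples in neither row i nor column i

precisionPerClass = tp ./ (tp + fp);
recallPerClass    = tp ./ (tp + fn);
```

For every class, tp + fp + fn + tn adds up to the 150 samples, which is a handy sanity check.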

Please have a look at the answers to this question for further information on performance measures for multi-class classification problems, particularly if you want to end up with a single number instead of one number per class. Of course, the easiest way to do this is to average the per-class values.
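
To illustrate two common ways of reducing the per-class values to a single number, here is a hedged sketch. The terms "macro" and "micro" average are standard terminology rather than something used in this thread, and the variable names are made up for the example.

```matlab
M = [50 0 0; 0 48 2; 0 4 46];   % assumed: rows = predicted, columns = true

% Macro average: compute the metric per class, then take the plain mean.
macroPrecision = mean(diag(M) ./ sum(M, 2));
macroRecall    = mean(diag(M) ./ sum(M, 1)');

% Micro average: pool tp, fp and fn over all classes before dividing.
tp = diag(M);
fp = sum(M, 2) - tp;
fn = sum(M, 1)' - tp;
microPrecision = sum(tp) / (sum(tp) + sum(fp));
microRecall    = sum(tp) / (sum(tp) + sum(fn));
```

When every sample has exactly one true and one predicted class, as here, the micro-averaged precision and recall both coincide with the overall accuracy (144/150), so the macro average is usually the more informative of the two.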

Update

I realized that you were actually looking for a MATLAB function to do this. I don't think there is any built-in function, and on the MATLAB File Exchange I only found a function for binary classification problems. However, the task is simple enough that you can define your own functions like so:

function y = precision(M)
  y = diag(M) ./ sum(M,2);
end

function y = recall(M)
  y = diag(M) ./ sum(M,1)';
end

Each function returns a column vector containing the precision or recall value for each class, respectively. Now you can simply call

>> mean(precision(M))

ans =

    0.9600

>> mean(recall(M))

ans =

    0.9605

to obtain the average precision and recall values of your model.

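Use the following MATLAB code:

   actual = ...
   predicted= ...
   cm = confusionmat(actual,predicted);   % rows = true class, columns = predicted class
   cm = cm';                              % transpose so that rows = predicted, columns = true
   precision = diag(cm)./sum(cm,2);       % per-class precision
   overall_precision = mean(precision)
   recall = diag(cm)./sum(cm,1)';         % per-class recall
   overall_recall = mean(recall)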
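
If confusionmat (part of the Statistics and Machine Learning Toolbox) is not available, the matrix can also be built by hand. The sketch below is an assumption-laden illustration: the label vectors are hypothetical integer class indices invented for the example, not data from the question, and `K` is the assumed number of classes.

```matlab
% Hypothetical integer class labels (1..K), made up for illustration only.
actual    = [1 1 1 2 2 2 3 3 3 3];
predicted = [1 1 2 2 2 3 3 3 3 1];
K = 3;

% cm(i,j) counts samples predicted as class i whose true class is j,
% i.e. rows = predicted and columns = true, so no transpose is needed.
cm = accumarray([predicted(:), actual(:)], 1, [K K]);

precision = diag(cm) ./ sum(cm, 2);    % per-class precision
recall    = diag(cm) ./ sum(cm, 1)';   % per-class recall
```

Swapping the two columns passed to accumarray would give the rows-are-true layout that confusionmat returns.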

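Another approach:

confMat = [50,0,0; 0,48,2; 0,4,46];   % assumes rows = predicted, columns = true
numClasses = size(confMat,1);

% Per-class precision: diagonal element divided by its row sum
precision = zeros(1,numClasses);
for i = 1:numClasses
    precision(i) = confMat(i,i)/sum(confMat(i,:));
end
precision(isnan(precision)) = 0;   % a class that was never predicted (0/0) counts as 0
Precision = sum(precision)/numClasses;

% Per-class recall: diagonal element divided by its column sum
recall = zeros(1,numClasses);
for i = 1:numClasses
    recall(i) = confMat(i,i)/sum(confMat(:,i));
end
Recall = sum(recall)/numClasses;

% F-score computed from the averaged precision and recall
F_score = 2*Recall*Precision/(Precision+Recall);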
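
Note that an F-score computed from the averaged precision and recall is not the same quantity as the average of the per-class F1 scores. A small sketch of the difference, assuming the same rows-are-predictions layout and using illustrative variable names:

```matlab
confMat = [50 0 0; 0 48 2; 0 4 46];   % assumed: rows = predicted, columns = true

p = diag(confMat) ./ sum(confMat, 2);    % per-class precision
r = diag(confMat) ./ sum(confMat, 1)';   % per-class recall

% Option 1: F-score of the averaged precision and recall.
fOfMeans = 2 * mean(p) * mean(r) / (mean(p) + mean(r));

% Option 2: average of the per-class F1 scores.
meanOfF1 = mean(2 * (p .* r) ./ (p + r));
```

For this matrix the two values are close (roughly 0.9602 versus 0.9600) but not identical, so it is worth stating which one is being reported.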
Licensed under: CC-BY-SA with attribution