Question

I am trying to calculate accuracy using the ROCR package in R, but the result is different from what I expected.

Assume I have a model's predictions (p) and the true labels (l) as follows:

p <- c(0.61, 0.36, 0.43, 0.14, 0.38, 0.24, 0.97, 0.89, 0.78, 0.86)
l <- c(1,     1,    1,    0,    0,     1,    1,    1,    0,     1)

I calculate the accuracy of these predictions with the following commands:

library(ROCR)
pred <- prediction(p, l)
perf <- performance(pred, "acc")
max(perf@y.values[[1]])

The result is 0.8, but according to the accuracy formula (TP+TN)/(TP+TN+FP+FN) it should be 0.6. Why is that?


Solution

When you use max(perf@y.values[[1]]), you get the maximum accuracy over all possible cutoffs for predicting a positive, not the accuracy at one fixed cutoff. The 0.6 you expected is the accuracy at a cutoff of 0.5.

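In your case, the optimal threshold is any value above 0.14 and up to 0.24, for example p=0.2. Predicting a positive for every observation at or above that threshold, you make 2 mistakes (on the observations with predicted probabilities 0.38 and 0.78, both of which are labeled 0), yielding a maximum accuracy of 0.8.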
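To check both numbers by hand, here is a minimal base-R sketch that does not use ROCR. The helper acc_at is a hypothetical name introduced for illustration, and it assumes an observation is predicted positive when its score is greater than or equal to the cutoff.

```r
p <- c(0.61, 0.36, 0.43, 0.14, 0.38, 0.24, 0.97, 0.89, 0.78, 0.86)
l <- c(1, 1, 1, 0, 0, 1, 1, 1, 0, 1)

# Hypothetical helper: share of correct predictions when every
# score >= cutoff is classified as positive (assumed convention).
acc_at <- function(cutoff) mean((p >= cutoff) == (l == 1))

# Accuracy at the single cutoff the question implicitly used
acc_fixed <- acc_at(0.5)

# Accuracy at every distinct observed score, highest score first
cutoffs <- sort(unique(p), decreasing = TRUE)
acc_all <- sapply(cutoffs, acc_at)

# Cutoff that gives the highest accuracy
best <- cutoffs[which.max(acc_all)]

print(acc_fixed)     # 0.6
print(max(acc_all))  # 0.8
print(best)          # 0.24
```

The best observed cutoff is 0.24, which classifies the data exactly as a threshold of 0.2 would, since no score falls between 0.14 and 0.24.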

You can access the cutoffs for your perf object using perf@x.values[[1]].
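As a sketch of how the cutoffs and accuracies can be paired up, the snippet below looks up the cutoff behind the maximum and the accuracy that corresponds to a 0.5 threshold. It assumes, as I understand ROCR's behavior, that the cutoffs are the observed scores plus a leading Inf and that scores at or above a cutoff count as positive. The variable names are my own.

```r
library(ROCR)

p <- c(0.61, 0.36, 0.43, 0.14, 0.38, 0.24, 0.97, 0.89, 0.78, 0.86)
l <- c(1, 1, 1, 0, 0, 1, 1, 1, 0, 1)

pred <- prediction(p, l)
perf <- performance(pred, "acc")

cutoffs <- perf@x.values[[1]]  # one cutoff per accuracy value
acc     <- perf@y.values[[1]]  # accuracy at each of those cutoffs

# Cutoff at which the maximum accuracy is reached
best_cutoff <- cutoffs[which.max(acc)]

# Accuracy for a 0.5 threshold: use the smallest reported cutoff
# that is still >= 0.5, since it classifies the data identically.
idx    <- which(cutoffs >= 0.5)
acc_05 <- acc[idx[which.min(cutoffs[idx])]]

print(best_cutoff)
print(acc_05)
```

If the assumptions above hold, best_cutoff should be 0.24 and acc_05 should match the 0.6 computed from the formula.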

Licensed under: CC-BY-SA with attribution