Question

I'm fitting a logistic regression on my training data. I used the glm function to get the model m. Then, using the code below from this link, I calculated the AUC:

    library(ROCR)
    # predicted probabilities for the test set
    test$score <- predict(m, newdata = test, type = "response")
    pred <- prediction(test$score, test$good_bad)
    perf <- performance(pred, "tpr", "fpr")

where good_bad is the dependent variable (0 or 1) and score holds the predicted probabilities.
To compute the tpr (true positive rate) and fpr (false positive rate), the predicted probabilities have to be classified as 1 or 0.
What cutoff is used for that, and how can we change it?

I could not find anything useful in the main documentation either.


Solution

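I can't access an R console at the moment to check, but as far as I know glm itself applies no cutoff: when the model predicts, it first computes the linear predictor and then, with type = 'response', applies the inverse link function on top, so what you get back are probabilities rather than 0/1 classes. 0.5 is only the conventional threshold if you classify those probabilities yourself. To the best of my knowledge, you can't set a cutoff inside the glm function, so your best bet is probably to inspect the ROC curve, find the optimal threshold, and use that as the cutoff.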
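
The last suggestion (pick a threshold from the ROC curve) might be sketched as follows. This is a minimal example under stated assumptions, not a definitive recipe: the data are simulated, the predictor x is hypothetical, and Youden's J (tpr minus fpr) is just one common criterion for "optimal". As far as I can tell, ROCR's performance object keeps one tpr/fpr pair per candidate cutoff, with the cutoffs stored in the alpha.values slot, so a threshold can be read from there and applied by hand.

```r
library(ROCR)

set.seed(42)
# Simulated stand-in for the question's data (hypothetical columns x and good_bad)
n <- 500
x <- rnorm(n)
d <- data.frame(x = x, good_bad = rbinom(n, 1, plogis(0.5 + 1.5 * x)))
train <- d[1:350, ]
test  <- d[351:n, ]

m <- glm(good_bad ~ x, data = train, family = binomial)

# type = "response" returns probabilities, not 0/1 classes
test$score <- predict(m, newdata = test, type = "response")

pred <- prediction(test$score, test$good_bad)
perf <- performance(pred, "tpr", "fpr")

# Each slot is a list with one element per run; there is a single run here
cutoffs <- perf@alpha.values[[1]]
fpr     <- perf@x.values[[1]]
tpr     <- perf@y.values[[1]]

# Youden's J = tpr - fpr; one possible way to choose a cutoff
best        <- which.max(tpr - fpr)
best_cutoff <- cutoffs[best]

# Apply the chosen cutoff by hand to get hard 0/1 predictions
test$pred_class <- as.integer(test$score >= best_cutoff)

# The AUC itself does not depend on any single cutoff
auc <- performance(pred, "auc")@y.values[[1]]
```

Any other criterion (for example, a cost-weighted one) can replace the which.max line; the rest of the sketch stays the same.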

OTHER TIPS

If you're not sure which label ROCR treated as the positive class, check str(pred_obj@labels): the greater of the two levels shown is considered positive. To change that, pass the label.ordering argument, a vector listing the negative label first and the positive label second, when creating the prediction object.
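
A small sketch of the label.ordering tip, assuming the ROCR package. The scores, the labels "good" and "bad", and the object names pred_default and pred_flipped are made up for illustration.

```r
library(ROCR)

# Toy scores and labels (hypothetical data)
scores <- c(0.9, 0.8, 0.35, 0.2)
labels <- c("good", "good", "bad", "bad")

# Default: ordering is inferred from R's "<" relation,
# so "bad" < "good" and "good" is the positive class
pred_default <- prediction(scores, labels)
str(pred_default@labels)

# label.ordering = c(negative, positive) makes "bad" the positive class
pred_flipped <- prediction(scores, labels, label.ordering = c("good", "bad"))
str(pred_flipped@labels)

# The AUC is mirrored when the positive class is swapped
auc_default <- performance(pred_default, "auc")@y.values[[1]]
auc_flipped <- performance(pred_flipped, "auc")@y.values[[1]]
```

If the AUC comes out well below 0.5 on real data, a swapped positive class is one thing worth checking this way.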

Licensed under: CC-BY-SA with attribution
Not affiliated with datascience.stackexchange