Question

How can I use scikit-learn or any other Python library to draw a ROC curve for a CSV file such as this:

1, 0.202
0, 0.203
0, 0.266
1, 0.264
0, 0.261
0, 0.291
.......

Solution

import pandas as pd
import numpy as np
import matplotlib.pyplot as pl  # pyplot rather than the discouraged pylab module
from sklearn.metrics import roc_curve, auc

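# The file has no header row; header=None stops pandas from consuming
# the first data row as column names.
df = pd.read_csv('filename.csv', header=None, skipinitialspace=True)

y_test = np.array(df)[:, 0]  # first column: true labels (0 or 1)
probas = np.array(df)[:, 1]  # second column: predicted scores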

# Compute ROC curve and area under the curve
fpr, tpr, thresholds = roc_curve(y_test, probas)
roc_auc = auc(fpr, tpr)
print("Area under the ROC curve : %f" % roc_auc)

# Plot ROC curve
pl.clf()
pl.plot(fpr, tpr, label='ROC curve (area = %0.2f)' % roc_auc)
pl.plot([0, 1], [0, 1], 'k--')
pl.xlim([0.0, 1.0])
pl.ylim([0.0, 1.0])
pl.xlabel('False Positive Rate')
pl.ylabel('True Positive Rate')
pl.title('Receiver operating characteristic')
pl.legend(loc="lower right")
pl.show()
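
To check the loading and scoring steps without a file on disk, here is a minimal sketch that runs on the six sample rows from the question. The helper name roc_from_csv and the column names "label" and "score" are made up for illustration, and it assumes the first column is the true class and the second is the classifier's score.

```python
import io

import pandas as pd
from sklearn.metrics import roc_curve, roc_auc_score

# The six sample rows from the question, inlined so no file is needed.
SAMPLE_CSV = """1, 0.202
0, 0.203
0, 0.266
1, 0.264
0, 0.261
0, 0.291
"""


def roc_from_csv(source):
    """Hypothetical helper: read 'label, score' rows, return (fpr, tpr, auc)."""
    # Assumption: no header row, label in column 0, score in column 1.
    df = pd.read_csv(source, header=None, names=["label", "score"],
                     skipinitialspace=True)
    fpr, tpr, _ = roc_curve(df["label"], df["score"])
    return fpr, tpr, roc_auc_score(df["label"], df["score"])


fpr, tpr, area = roc_from_csv(io.StringIO(SAMPLE_CSV))
print("AUC on the six sample rows: %.2f" % area)
```

On these six rows the area comes out at 0.25, i.e. below the 0.5 diagonal, which is worth knowing before reading anything into a plot of such a small sample. Passing a file path instead of the StringIO object should work the same way.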

OTHER TIPS

Not an answer for Python, but if you use R (http://www.r-project.org/) it is as easy as:

# load data
X <- read.table("mydata.csv", sep = ",")

# create and plot ROC curve
library(ROCR)
roc <- ROCR::performance(ROCR::prediction(X[,2], X[,1]), "tpr", "fpr")
plot(roc)

(You need to install the R package ROCR beforehand via install.packages("ROCR").)
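
For readers staying in Python, what the "tpr" and "fpr" measures amount to can be sketched in plain NumPy, without scikit-learn. This is only an illustration on the six sample rows from the question: the helper name roc_points is hypothetical, and the sketch assumes there are no tied scores (ties would need to be grouped into a single threshold step).

```python
import numpy as np

# Labels and scores from the six sample rows in the question.
labels = np.array([1, 0, 0, 1, 0, 0])
scores = np.array([0.202, 0.203, 0.266, 0.264, 0.261, 0.291])


def roc_points(labels, scores):
    """Hypothetical helper: sweep the threshold from high to low.

    Assumes binary 0/1 labels and no tied scores.
    """
    order = np.argsort(-scores)        # highest score first
    hits = labels[order] == 1
    tp = np.cumsum(hits)               # true positives after each cut
    fp = np.cumsum(~hits)              # false positives after each cut
    # Prepend the (0, 0) point, where nothing is predicted positive.
    tpr = np.concatenate(([0.0], tp / hits.sum()))
    fpr = np.concatenate(([0.0], fp / (~hits).sum()))
    return fpr, tpr


fpr, tpr = roc_points(labels, scores)
# Trapezoidal area under the (fpr, tpr) polyline.
area = float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))
print("AUC: %.2f" % area)
```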

Licensed under: CC-BY-SA with attribution