Question

How can I use scikit-learn or any other Python library to draw a ROC curve for a CSV file such as this:

1, 0.202
0, 0.203
0, 0.266
1, 0.264
0, 0.261
0, 0.291
.......

Solution

import pandas as pd
import numpy as np
import matplotlib.pyplot as pl  # pylab is discouraged; pyplot provides the same plotting calls
from sklearn.metrics import roc_curve, auc

# The file has no header row, so header=None keeps the first line as data
df = pd.read_csv('filename.csv', header=None)

y_test = np.array(df)[:,0]
probas = np.array(df)[:,1]

# Compute ROC curve and area under the curve
fpr, tpr, thresholds = roc_curve(y_test, probas)
roc_auc = auc(fpr, tpr)
print("Area under the ROC curve : %f" % roc_auc)

# Plot ROC curve
pl.clf()
pl.plot(fpr, tpr, label='ROC curve (area = %0.2f)' % roc_auc)
pl.plot([0, 1], [0, 1], 'k--')
pl.xlim([0.0, 1.0])
pl.ylim([0.0, 1.0])
pl.xlabel('False Positive Rate')
pl.ylabel('True Positive Rate')
pl.title('Receiver operating characteristic')
pl.legend(loc="lower right")
pl.show()
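
For readers who want to see what roc_curve and auc compute, here is a minimal sketch using only the standard library. It is not how scikit-learn implements them; the helper names roc_points and trapezoid_auc are made up for this illustration, and the six rows from the question are inlined as a string in place of the real file. It sweeps the threshold from the highest score to the lowest, records one (FPR, TPR) point per distinct score, and integrates with the trapezoid rule.

```python
import csv
import io


def roc_points(labels, scores):
    """Return (fpr, tpr) lists from sweeping the threshold high to low."""
    pairs = sorted(zip(scores, labels), reverse=True)
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative label")

    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    for i, (score, label) in enumerate(pairs):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last row of a group of tied scores
        if i + 1 == len(pairs) or pairs[i + 1][0] != score:
            fpr.append(fp / n_neg)
            tpr.append(tp / n_pos)
    return fpr, tpr


def trapezoid_auc(fpr, tpr):
    """Area under the piecewise-linear curve through the ROC points."""
    return sum(
        (fpr[i] - fpr[i - 1]) * (tpr[i] + tpr[i - 1]) / 2
        for i in range(1, len(fpr))
    )


# The six rows shown in the question, inlined instead of reading a file
sample = "1, 0.202\n0, 0.203\n0, 0.266\n1, 0.264\n0, 0.261\n0, 0.291\n"
rows = list(csv.reader(io.StringIO(sample), skipinitialspace=True))
labels = [int(r[0]) for r in rows]
scores = [float(r[1]) for r in rows]

fpr, tpr = roc_points(labels, scores)
roc_auc = trapezoid_auc(fpr, tpr)
print("Area under the ROC curve : %f" % roc_auc)
```

To run this on a real file, replace io.StringIO(sample) with an open file handle. The resulting fpr and tpr lists can be passed to the same plotting calls as above.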

Other tips

Not an answer for Python, but if you use R (http://www.r-project.org/) it is as easy as:

# load data
X <- read.table("mydata.csv", sep = ",")

# create and plot ROC curve
library(ROCR)
roc <- ROCR::performance(ROCR::prediction(X[,2], X[,1]), "tpr", "fpr")
plot(roc)

(You need to install the R package ROCR beforehand via install.packages("ROCR").)

Licensed under: CC-BY-SA with attribution