Question

How can I use scikit-learn or any other Python library to draw a ROC curve for a CSV file such as this (first column is the true label, second column is the predicted score):

1, 0.202
0, 0.203
0, 0.266
1, 0.264
0, 0.261
0, 0.291
.......

Solution

import pandas as pd
import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve, auc

# The file has no header row, so tell pandas not to consume the first
# data line as column names; skipinitialspace handles the blank after each comma
df = pd.read_csv('filename.csv', header=None, skipinitialspace=True)

y_test = df.iloc[:, 0].to_numpy()   # true labels (0 or 1)
probas = df.iloc[:, 1].to_numpy()   # predicted scores

# Compute the ROC curve and the area under the curve
fpr, tpr, thresholds = roc_curve(y_test, probas)
roc_auc = auc(fpr, tpr)
print("Area under the ROC curve: %f" % roc_auc)

# Plot the ROC curve
plt.clf()
plt.plot(fpr, tpr, label='ROC curve (area = %0.2f)' % roc_auc)
plt.plot([0, 1], [0, 1], 'k--')  # diagonal = random classifier
plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.0])
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('Receiver operating characteristic')
plt.legend(loc="lower right")
plt.show()
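
If it helps to see what the curve actually consists of, here is a minimal pure-Python sketch of the threshold sweep behind a ROC curve, run on the six sample rows from the question. The helper names roc_points and trapezoid_auc are made up for this illustration and are not part of scikit-learn; roc_curve itself may return fewer points, since by default it can drop intermediate thresholds.

```python
def roc_points(labels, scores):
    """Sweep the threshold from the highest score down and collect (fpr, tpr)."""
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    # Rank samples by score, highest first
    ranked = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Samples tied at the same score cross the threshold together
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def trapezoid_auc(points):
    """Area under the piecewise-linear curve through the given points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# The six sample rows from the question: label, score
labels = [1, 0, 0, 1, 0, 0]
scores = [0.202, 0.203, 0.266, 0.264, 0.261, 0.291]

points = roc_points(labels, scores)
roc_auc = trapezoid_auc(points)
print(points)
print("AUC = %.2f" % roc_auc)
```

The x values of points correspond to fpr and the y values to tpr in the scikit-learn version, so they can be passed to the same plotting calls.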

Other tips

Not an answer for Python, but if you use R (http://www.r-project.org/), it is as easy as:

# load data
X <- read.table("mydata.csv", sep = ",")

# create and plot ROC curve
library(ROCR)
roc <- ROCR::performance(ROCR::prediction(X[,2], X[,1]), "tpr", "fpr")
plot(roc)

(You need to install the R package ROCR beforehand via install.packages("ROCR").)

Licensed under: CC-BY-SA with attribution