Question

I have a CSV file (tab-delimited) that contains two columns and looks like this:

5   0
6   0
9   0
8   1
(5000+ more similar lines)

I am attempting to create a ROC plot using ROCR.

This is what I have tried so far:

p<-read.csv(file="forROC.csv", head=TRUE, sep="\t")
pred<-prediction(p[1],p[2])

The second line gives me an error: Error in prediction(p[1], p[2]) : Number of classes is not equal to 2. ROCR currently supports only evaluation of binary classification tasks.

I am not sure what the error is. Is there something wrong with my CSV file?


Solution

My guess is that your indexing isn't set up properly. If you read in that CSV file, you should expect a data.frame (think matrix or 2D array, depending on your background) with two columns and 5,000+ rows.

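So your current calls to p[1] and p[2] don't give you what you expect: indexing a data.frame with a single index in single brackets returns a one-column data.frame, not the column's values as a vector. You probably want to access the first and second columns of that data.frame, which you can do with p[,1] for the first column and p[,2] for the second.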
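To make the difference between the indexing forms concrete, here is a minimal sketch in base R. The small data.frame and its column names (score, label) are made up for illustration and only stand in for the contents of forROC.csv.

```r
# Hypothetical stand-in for the data read from forROC.csv
# (the column names "score" and "label" are assumptions)
p <- data.frame(score = c(5, 6, 9, 8), label = c(0, 0, 0, 1))

class(p[1])    # "data.frame": one index in single brackets keeps the data.frame
class(p[, 1])  # "numeric": the first column as a plain vector
class(p[[1]])  # "numeric": double brackets also extract the column itself
```

Either p[,1] or p[[1]] hands the column's values to the next function as an ordinary vector.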

The specific error you're encountering, however, is a complaint that the "truth" variable you're using isn't binary. It seems that your file is set up to have an output of 1 and 0, so this error may go away once you access the columns properly. But if you encounter it in the future, make sure your truth data is binary before you use it. For instance:

p[,2] <- p[,2] != 0

This sets each value in the column to FALSE if it is zero and to TRUE otherwise.
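Putting the pieces together, a rough sketch of the whole flow might look like the following. It assumes the ROCR package is installed, uses a small made-up data.frame in place of the real file (the column names and values are hypothetical), and treats the first column as the score and the second as the truth label.

```r
library(ROCR)

# Hypothetical stand-in data; in practice p would come from read.csv(...).
# The label column deliberately contains a third value (2) to show binarizing.
p <- data.frame(score = c(5, 6, 9, 8, 7, 3),
                label = c(0, 0, 2, 1, 1, 0))

# Binarize the truth column: zero becomes FALSE, anything else TRUE
p[, 2] <- p[, 2] != 0

# Pass plain vectors (note the comma) to prediction()
pred <- prediction(p[, 1], p[, 2])

# True positive rate against false positive rate gives the ROC curve
perf <- performance(pred, measure = "tpr", x.measure = "fpr")
plot(perf)
```

With logical labels, ROCR takes FALSE as the negative class and TRUE as the positive class, so higher scores are read as evidence for TRUE.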

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow