Question

I have the following ROC Curve:

[image: ROC curve]

It does not end at 1.0 because my predictions include zeros, for example:

prediction = [0.9, 0.1, 0.8, 0.0]

For the ROC curve, I take the top-k predictions: first {0.9}, then {0.9, 0.8}, etc. Once there are no values > 0 left in the prediction, the predicted set no longer changes with increasing k.

So I can't get a true negative count of zero, and since the false positive rate is FP / (FP + TN), the curve ends before it reaches 1.
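Here is a minimal sketch of that procedure (the true labels are hypothetical, chosen only to make it concrete):

```python
import numpy as np

y_true     = np.array([1, 0, 1, 0])            # hypothetical labels
prediction = np.array([0.9, 0.1, 0.8, 0.0])

# One threshold per positive score; items scored 0.0 never become positive.
for t in sorted((p for p in prediction if p > 0), reverse=True):
    y_pred = (prediction >= t).astype(int)
    tp = int(np.sum((y_pred == 1) & (y_true == 1)))
    fp = int(np.sum((y_pred == 1) & (y_true == 0)))
    fn = int(np.sum((y_pred == 0) & (y_true == 1)))
    tn = int(np.sum((y_pred == 0) & (y_true == 0)))
    print(f"t={t}: FPR={fp / (fp + tn)}, TPR={tp / (tp + fn)}")

# Even at the lowest positive threshold, one true negative remains,
# so the FPR never reaches 1 and the curve stops short of (1, 1).
```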

Now, should I artificially include the zeros as predictions as well, or is it OK if the curve just ends like that? Using the zeros feels wrong. Or am I missing something?


Solution

The ROC curve shows the possible tradeoffs between false positives and false negatives as you vary the decision threshold. At one extreme, you can set the threshold so low that you label everything as positive, giving you a false negative rate of 0 and a false positive rate of 1. At the other extreme, you can set the threshold so high that you label everything as negative, giving you a false negative rate of 1 and a false positive rate of 0.

While these degenerate cases are not useful in practice, they are still theoretically valid tradeoffs and are a normal part of the ROC curve.
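For what it's worth, here is a minimal sketch with scikit-learn (the labels are hypothetical, mirroring the example in the question): roc_curve sweeps the threshold down through the lowest score, so the everything-positive endpoint at (1, 1) is always part of the curve it returns.

```python
import numpy as np
from sklearn.metrics import roc_curve

y_true     = np.array([1, 0, 1, 0])            # hypothetical labels
prediction = np.array([0.9, 0.1, 0.8, 0.0])

fpr, tpr, thresholds = roc_curve(y_true, prediction)

# The lowest threshold labels every sample positive,
# so the curve always ends at (FPR, TPR) = (1.0, 1.0).
print(fpr[-1], tpr[-1])  # 1.0 1.0
```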

OTHER TIPS

Yes, of course! As Antimony mentioned before, the ROC curve shows the trade-off between the false positive rate and the true positive rate. I remember once training a neural net on a dataset and getting a false positive rate of 0 (since FP was 0) about 90% of the time I ran the model, which was great! Since my TPR was 1 most of the time, my ROC curve looked odd: it was mostly a few points on the y-axis (the TPR axis).

Your model is working fine; your FPR simply does not go beyond a certain value.

Let me give you an example. For specific input variables, my model behaves as below:

Predicted output: [0.97, 5.78E-4, 6.15E-4]
Real output: [1.0, 0.0, 0.0]

You can see that the model is predicting perfectly, since the first value, the prediction for the class whose true label is 1, is easily distinguishable from the other two: [5.78E-4, 6.15E-4] are tiny in comparison with 0.97. For every cut-off between those tiny values and 0.97, the 0.97 is mapped to 1 and the other two values are mapped to 0, so no matter what the cut-off is, the TPR is high and the FPR is zero.
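A minimal sketch that checks this numerically (plain NumPy; the cut-off values are arbitrary picks from that range):

```python
import numpy as np

predicted = np.array([0.97, 5.78e-4, 6.15e-4])
real      = np.array([1.0, 0.0, 0.0])

# Any cut-off between 6.15e-4 and 0.97 separates the classes perfectly.
for cutoff in (0.5, 0.1, 0.001):
    y_pred = (predicted >= cutoff).astype(float)
    tp = np.sum((y_pred == 1) & (real == 1))
    fp = np.sum((y_pred == 1) & (real == 0))
    fn = np.sum((y_pred == 0) & (real == 1))
    tn = np.sum((y_pred == 0) & (real == 0))
    print(f"cutoff={cutoff}: TPR={tp / (tp + fn)}, FPR={fp / (fp + tn)}")

# Every cut-off in that range gives TPR = 1.0 and FPR = 0.0.
```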
