ROC curve from the result of a classification or clustering
-
25-05-2021 - |
Вопрос
Say that I've clustered a training dataset of 5 classes containing 1000 instances, to 5 clusters (centers) using for example k-means. Then I've constructed a confusion matrix by validating on a test dataset. I want then to use plot a ROC curve from this, how is it possible to do that ?
Решение
Roc Curves show trade-off between True Positive and False Positive Rate. In other words
ROC graphs are two-dimensional graphs in which TP rate is plotted on the Y axis and FP rate is plotted on the X axis ROC Graphs: Notes and Practical Considerations for Researchers
When you use a discrete classifier, that classifier produces only a single point in ROC Space. Normally you need a classifier which produces probabilities. You change your parameters in classifier so that your TP and FP rates change. After that you use this points to draw a ROC curve.
Lets say you use k-means. K-means give you cluster membership discretely. A point belongs to ClusterA or .. ClusterE. Therefore outputting ROC curve from k-means is not straightforward. Lee and Fujita describes an algorithm for this. You should look to their paper. But algorithm is something like this.
- Apply k-means
- calculate TP and FP using test data.
- change membership of data points from one cluster to second cluster.
- calculate TP and FP using test data again.
As you see they get more points in ROC space and use these points to draw ROC curve