Adjust predicted probability after SMOTE
06-12-2019
Question
I have an imbalanced dataset, and I used SMOTE to oversample the minority class, combined with undersampling of the majority class. Now I want to check the test AUC using the model's predict_proba.
I have two questions:
1. Do I have to correct the probabilities if I am comparing AUCs?
2. How can I correct them, given that I used a combination of undersampling and oversampling?
Solution
No. AUC depends only on how the predictions are ranked, not on their actual values. Any adjustment to the probabilities will presumably be monotonic, so the rank ordering of the predictions stays the same, and so does the AUC.
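To make the rank-ordering argument concrete, here is a minimal sketch in plain Python. The labels and scores are made-up toy values, and `auc` and `shrink` are hypothetical helper names of my own, not part of any library; `shrink` just stands in for some monotonic correction.

```python
def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def shrink(p, factor=0.1):
    """A strictly increasing transform: rescale the odds by a positive factor."""
    odds = p / (1 - p)
    new_odds = odds * factor
    return new_odds / (1 + new_odds)


# Toy data (assumed for illustration only).
labels = [0, 0, 1, 0, 1, 1, 0, 1]
probs = [0.10, 0.35, 0.40, 0.55, 0.60, 0.80, 0.20, 0.90]

adjusted = [shrink(p) for p in probs]

print(auc(labels, probs), auc(labels, adjusted))
```

The two printed values should match: the adjusted probabilities are much smaller, but every positive/negative pair is still ordered the same way.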
See, e.g., https://datascience.stackexchange.com/a/58899/55122
See also the more complex "probability calibration" techniques.
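As for the second question, a simple (non-calibration) correction can be sketched as follows. This is the standard prior-shift adjustment: rescale the predicted odds by the ratio of the true class odds to the class odds in the resampled training set. It assumes resampling changed only the class proportions, which is only approximately true for SMOTE because the synthetic points are interpolations. Under that assumption, it does not matter how much of the rebalancing came from oversampling versus undersampling; only the final positive rate in the training set is used. The function name and the example rates below are my own, for illustration.

```python
def correct_for_resampling(p, train_pos_rate, true_pos_rate):
    """Map a probability from a model trained on resampled data back to the
    original class balance (prior-shift correction; an approximation for SMOTE).

    p               -- predicted probability of the positive class
    train_pos_rate  -- positive fraction in the resampled training set
    true_pos_rate   -- positive fraction in the original data
    """
    train_odds = train_pos_rate / (1 - train_pos_rate)
    true_odds = true_pos_rate / (1 - true_pos_rate)
    ratio = true_odds / train_odds
    # Equivalent to rescaling the odds of p by `ratio`, written so that
    # p = 0 and p = 1 are handled without dividing by zero.
    return p * ratio / (p * ratio + (1 - p))


# Assumed example: training set rebalanced to 50% positives,
# original data had 5% positives.
print(correct_for_resampling(0.5, train_pos_rate=0.5, true_pos_rate=0.05))
```

In this example a score of 0.5 from the rebalanced model maps back to roughly the original base rate of 0.05. Since the mapping is strictly increasing in p, it changes thresholds and calibration but not the AUC.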
Also, if you see better results after SMOTE plus undersampling, and can share your data and work, I'd be very interested. I haven't yet seen an example where training on the original dataset doesn't do just as well (with appropriate thresholding).