Question

When selecting a probability threshold to maximize the F1 score prior to deploying a model (based on the precision-recall curve), should the threshold be selected based on the training or holdout dataset?

Solution

Ideally, the threshold should be selected on your training set. The holdout set is only there to confirm that whatever worked on your training set generalizes to data outside of it.

This is the same reason hyperparameter-tuning utilities such as GridSearchCV and RandomizedSearchCV in scikit-learn expose a cv parameter: they cross-validate across folds of the training set rather than letting you choose the best parameters based on metrics measured on the holdout set.
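The same cross-validation idea applies to threshold selection: use out-of-fold probabilities on the training set to pick the F1-maximizing threshold, then touch the holdout set only once to confirm it. A minimal sketch with scikit-learn, using a synthetic dataset and a logistic regression purely as illustrative assumptions:

```python
# Sketch: choosing the F1-maximizing threshold via cross-validation
# on the training set. The dataset and model here are illustrative
# assumptions, not from the original answer.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split, cross_val_predict
from sklearn.metrics import precision_recall_curve, f1_score

X, y = make_classification(n_samples=2000, weights=[0.8, 0.2], random_state=0)
X_train, X_hold, y_train, y_hold = train_test_split(
    X, y, stratify=y, random_state=0
)

clf = LogisticRegression(max_iter=1000)

# Out-of-fold probabilities on the TRAINING set, so the threshold is
# never tuned on data the model was fitted on.
proba = cross_val_predict(clf, X_train, y_train, cv=5, method="predict_proba")[:, 1]

precision, recall, thresholds = precision_recall_curve(y_train, proba)
# thresholds has one fewer entry than precision/recall, hence f1[:-1].
f1 = 2 * precision * recall / (precision + recall + 1e-12)
best_threshold = thresholds[np.argmax(f1[:-1])]

# Fit on the full training set; use the holdout only to confirm that
# the chosen threshold generalizes.
clf.fit(X_train, y_train)
hold_pred = clf.predict_proba(X_hold)[:, 1] >= best_threshold
print(f"chosen threshold: {best_threshold:.3f}")
print(f"holdout F1: {f1_score(y_hold, hold_pred):.3f}")
```

The holdout F1 should land close to the cross-validated F1; a large gap would suggest the threshold (or the model) is overfit to the training data.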

Licensed under: CC-BY-SA with attribution
Not affiliated with datascience.stackexchange