Question

When selecting a probability threshold to maximize the F1 score prior to deploying a model (based on the precision-recall curve), should the threshold be selected based on the training or holdout dataset?

Solution

Ideally, the threshold should be selected on your training set. The holdout set is there only to confirm that whatever worked on the training set generalizes to data outside of it.
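
To make this concrete, here is a minimal sketch in plain Python. It picks the F1-maximizing threshold from training scores only, then applies that fixed threshold to the holdout scores as a check. The helper names (`f1_at_threshold`, `best_f1_threshold`, `make_scores`) and the synthetic scores are assumptions for illustration; in practice the scores would come from your own model.

```python
import random

def f1_at_threshold(y_true, scores, threshold):
    """F1 score when predicting positive for scores >= threshold."""
    tp = fp = fn = 0
    for y, s in zip(y_true, scores):
        pred = s >= threshold
        if pred and y == 1:
            tp += 1
        elif pred and y == 0:
            fp += 1
        elif not pred and y == 1:
            fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def best_f1_threshold(y_true, scores):
    """Return the observed score that gives the highest F1 when used as threshold."""
    candidates = sorted(set(scores))
    return max(candidates, key=lambda t: f1_at_threshold(y_true, scores, t))

def make_scores(n, rng):
    """Hypothetical stand-in for model output: positives tend to score higher."""
    y = [1 if rng.random() < 0.3 else 0 for _ in range(n)]
    s = [min(1.0, max(0.0, rng.gauss(0.65 if label else 0.35, 0.15))) for label in y]
    return y, s

rng = random.Random(0)
y_train, s_train = make_scores(500, rng)
y_hold, s_hold = make_scores(200, rng)

# Choose the threshold using training data only.
threshold = best_f1_threshold(y_train, s_train)
train_f1 = f1_at_threshold(y_train, s_train, threshold)

# The holdout set is used once, with the threshold already frozen.
holdout_f1 = f1_at_threshold(y_hold, s_hold, threshold)
print(f"threshold={threshold:.3f} train F1={train_f1:.3f} holdout F1={holdout_f1:.3f}")
```

If the holdout F1 is much lower than the training F1, that suggests the threshold (or the model) does not generalize well; the remedy is to revisit the training procedure, not to re-tune the threshold on the holdout set.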

This is why hyperparameter tuning tools in Python, such as scikit-learn's GridSearchCV and RandomizedSearchCV, take a cv parameter: they cross-validate across different folds of the training set instead of letting you choose the best parameters based on a metric measured on the holdout set.
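
The same idea can be applied to the threshold itself. The following is a sketch, assuming scikit-learn and NumPy are installed and using a synthetic dataset and logistic regression purely as placeholders: it computes out-of-fold probabilities on the training set with `cross_val_predict`, picks the F1-maximizing threshold from those, and only then scores the holdout set.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, precision_recall_curve
from sklearn.model_selection import cross_val_predict, train_test_split

# Synthetic, imbalanced data standing in for a real problem (assumption).
X, y = make_classification(
    n_samples=2000, n_features=10, weights=[0.8, 0.2], random_state=0
)
X_train, X_hold, y_train, y_hold = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

model = LogisticRegression(max_iter=1000)

# Out-of-fold probabilities: each training row is scored by a model
# that was fitted on the other folds and never saw that row.
oof_proba = cross_val_predict(
    model, X_train, y_train, cv=5, method="predict_proba"
)[:, 1]

precision, recall, thresholds = precision_recall_curve(y_train, oof_proba)
# The final precision/recall pair has no matching threshold, so drop it.
p, r = precision[:-1], recall[:-1]
f1 = 2 * p * r / np.maximum(p + r, 1e-12)
threshold = thresholds[np.argmax(f1)]

# Refit on the full training set, then apply the frozen threshold to holdout.
model.fit(X_train, y_train)
hold_pred = (model.predict_proba(X_hold)[:, 1] >= threshold).astype(int)
holdout_f1 = f1_score(y_hold, hold_pred)
print(f"threshold={threshold:.3f} holdout F1={holdout_f1:.3f}")
```

Using out-of-fold predictions avoids choosing the threshold from probabilities the model produced for rows it was trained on, which tend to be overconfident.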

Licensed under: CC-BY-SA with attribution