Is it legitimate to use a threshold on a model's output in order to tune recall and precision?

datascience.stackexchange https://datascience.stackexchange.com/questions/74930

  •  11-12-2020

Question

I have just finished reading an article about the F1 score, recall, and precision. Everything was clear except for the fact that the author, in his example (see https://towardsdatascience.com/beyond-accuracy-precision-and-recall-3da06bea9f6c#a9f0), uses a threshold on his model's output to tune the F1 score, recall, and precision values.

So my question is: is it legitimate to use a threshold the way the author does in his article?

In my opinion, this looks like a kind of manual overfitting, but maybe there is something I have misunderstood...


Solution

Good question. Using a threshold is perfectly fine and is not "manual overfitting".

It is not manual, because this is a step that can (and should) be done automatically. And it is not overfitting, because it doesn't modify the model itself; it only changes how you interpret the model's output.
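To make the "automatic" part concrete, here is a minimal sketch of threshold tuning, assuming scikit-learn, a synthetic dataset, and F1 as the target metric (all of which are illustrative choices, not taken from the article):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_recall_curve
from sklearn.model_selection import train_test_split

# Synthetic imbalanced data (a stand-in for any real dataset).
X, y = make_classification(n_samples=2000, weights=[0.9, 0.1], random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# The model itself is untouched; we only reinterpret its probabilities.
probs = model.predict_proba(X_val)[:, 1]

# Scan every candidate threshold and keep the one that maximizes F1.
precision, recall, thresholds = precision_recall_curve(y_val, probs)
f1 = 2 * precision * recall / (precision + recall + 1e-12)
best = int(np.argmax(f1[:-1]))  # the last P/R point has no threshold
print(f"best threshold = {thresholds[best]:.3f}, F1 = {f1[best]:.3f}")
```

Because the threshold is selected on a held-out validation set, it generalizes like any other hyperparameter instead of overfitting the training data.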

What the author did is actually called cost-sensitive learning. It is a technique where you define which type of error is more costly, and those costs are reflected in the performance metric you use.
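As a sketch of what cost-sensitive threshold selection can look like (the 1:5 false-positive/false-negative costs below are made-up numbers; in practice they would come from the application):

```python
import numpy as np

def best_threshold_by_cost(y_true, probs, cost_fp=1.0, cost_fn=5.0):
    """Return the threshold that minimizes total misclassification cost.

    cost_fp and cost_fn are illustrative; in a real project they come
    from the domain (e.g. cost of a needless alert vs. a missed fraud).
    """
    candidates = np.unique(probs)
    costs = [
        cost_fp * np.sum((probs >= t) & (y_true == 0))
        + cost_fn * np.sum((probs < t) & (y_true == 1))
        for t in candidates
    ]
    return candidates[int(np.argmin(costs))]

# Toy usage with hand-made labels and scores.
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
probs = np.array([0.9, 0.4, 0.35, 0.8, 0.2, 0.55, 0.3, 0.1])
print(best_threshold_by_cost(y_true, probs))
```

Because false negatives are weighted five times as heavily here, the selected threshold ends up low, trading precision for recall.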

I can see why it may feel like overfitting, but it is important to understand that all you are doing here is handcrafting the performance metric with which you will evaluate your model. The model then simply tries to optimize that metric.
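For instance, the F-beta score is a standard handcrafted metric of this kind. A small sketch, assuming scikit-learn and an illustrative beta of 2 (meaning recall counts roughly twice as much as precision):

```python
import numpy as np
from sklearn.metrics import fbeta_score

y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
probs = np.array([0.9, 0.4, 0.35, 0.8, 0.2, 0.55, 0.3, 0.1])

# beta > 1 weights recall more heavily, beta < 1 favors precision;
# the right beta is a domain decision, not a statistical one.
for t in (0.3, 0.5, 0.7):
    preds = (probs >= t).astype(int)
    print(f"threshold {t}: F2 = {fbeta_score(y_true, preds, beta=2):.3f}")
```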

Licensed under: CC-BY-SA with attribution
Not affiliated with datascience.stackexchange