How to choose a model for this cross-validation curve?

https://datascience.stackexchange.com/questions/62147

model-selection
gridsearchcv

02-11-2019
|

Pergunta

I'm using GridSearchCV to tune hyperparameters for a Logistic Regression multiclass model.

I read on Kaggle that you should choose the hyperparameter that results in the lowest discrepancy between the CV-score and the training score, but in this case this leads to a very low score.

How should I choose the proper C value to ensure generalisability of the model but also high model performance based on the CV-curve below?

From my understanding opting for low discrepency between the two scores ensures the ability of the model to be generalised to unseen data. But on the other hand I want a score as high as possible on unseen data.

Thanks for any help!

Nenhuma solução correta

Licenciado em: CC-BY-SA com atribuição

Não afiliado a datascience.stackexchange