Question

When I have the data set to train a model with SVM, which procedure is performed first, cross validation or grid search? I have read this in a couple of books but I don't know in what order all this should be done. If cross-validation is first performed, what hyperparameters do I use there if I have not found the optimal values provided by the grid search? In addition, throughout this procedure, where should the confusion matrix be calculated?

Thanks in advance


Solution

Well, grid search involves finding the best hyperparameters. Best according to what data set? A held-out validation set. If that's what you mean by cross validation, then the two necessarily happen simultaneously.

It doesn't really make sense to do something called cross validation before evaluating hyperparameters - indeed, what would you be evaluating?

CV, as in k-fold cross validation, can also happen within each model fit during the search, to produce a better estimate of the loss (and of its variance, which is useful in more sophisticated tuning procedures). I think this is less common, but it's valid.
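A minimal sketch of k-fold CV inside a grid search, using scikit-learn's `GridSearchCV`; the dataset and the parameter grid here are illustrative assumptions, not values from the question:

```python
# Assumed setup: synthetic binary classification data and a small SVM grid.
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, random_state=0)

param_grid = {"C": [0.1, 1, 10], "gamma": ["scale", 0.01]}

# Each (C, gamma) candidate is scored by 5-fold cross validation,
# so its score is a mean over 5 held-out folds - CV happens *inside* the search.
search = GridSearchCV(SVC(), param_grid, cv=5)
search.fit(X, y)

print(search.best_params_)
print(search.best_score_)  # mean CV accuracy of the best candidate
```

`cv_results_` on the fitted search also exposes the per-fold scores, which is where the variance estimate mentioned above would come from.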

It's possible to use CV when fitting the final model after the hyperparameter search. It might give you a better estimate of the loss, or of the confusion matrix, since you compute many of them - but each model you fit isn't using all available data. It's probably more conventional to take the best model's parameters and its loss / confusion matrix from the search as an estimate of generalization, and then refit the final model on all data. That means no CV at that stage.
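In scikit-learn this "refit the winner on all data" convention is built in: with `refit=True` (the default), `GridSearchCV` retrains the best candidate on the full dataset after the search. A sketch, again on assumed synthetic data:

```python
# Assumed setup: synthetic data; the C grid is illustrative.
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, random_state=0)

search = GridSearchCV(SVC(), {"C": [0.1, 1, 10]}, cv=5, refit=True)
search.fit(X, y)

# best_score_ is the CV-based generalization estimate from the search itself;
# best_estimator_ is a fresh model refit on ALL the data (no CV at this stage).
final_model = search.best_estimator_
print(search.best_params_, round(search.best_score_, 3))
```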

Other tips

Both are done together!

Grid search is how you determine the hyperparameters of a model, whereas cross-validation is the process of running your model on data held out from the training dataset to gauge the model's performance under a particular hyperparameter setting.

Grid search lets you specify a set of candidate hyperparameters, among which you compare the model's performance by training on the training dataset and evaluating on the validation dataset.
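The train-then-evaluate loop described above can be written out by hand with a single held-out validation split; the candidate `C` values and synthetic data below are assumptions for illustration:

```python
# Sketch: compare candidate hyperparameters on a held-out validation set.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(
    X, y, test_size=0.25, random_state=0
)

best_C, best_acc = None, -1.0
for C in [0.01, 0.1, 1, 10, 100]:
    model = SVC(C=C).fit(X_tr, y_tr)   # train on the training split
    acc = model.score(X_val, y_val)    # evaluate on the validation split
    if acc > best_acc:
        best_C, best_acc = C, acc

print(best_C, round(best_acc, 3))
```

This is exactly what `GridSearchCV` automates, with k-fold CV replacing the single split.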

The metric you use depends on the final objective. You can choose the hyperparameters that give the minimum loss, maximum accuracy, maximum AUROC, or maximum F1-score.

The confusion matrix you mention can be computed during training as well as during validation; it lets you gauge the performance of your current model. For a binary problem the matrix gives you four numbers (true/false positives and negatives), which are combined in different ways to obtain metrics such as accuracy, precision, recall, and F1-score. (AUROC is a bit different: it is computed from the classifier's scores across many decision thresholds, not from a single confusion matrix.) Which of these metrics to use depends on the final task you want to perform.
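A short sketch of computing the confusion matrix on validation predictions and deriving two of those metrics from its four cells; the model and data are assumed placeholders:

```python
# Sketch: confusion matrix on the validation set, plus derived metrics.
from sklearn.datasets import make_classification
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(
    X, y, test_size=0.25, random_state=0
)

model = SVC(C=1.0).fit(X_tr, y_tr)
cm = confusion_matrix(y_val, model.predict(X_val))
tn, fp, fn, tp = cm.ravel()  # the four numbers for the binary case

accuracy = (tp + tn) / (tp + tn + fp + fn)
f1 = 2 * tp / (2 * tp + fp + fn)
print(cm)
print(round(accuracy, 3), round(f1, 3))
```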

Licensed under: CC-BY-SA with attribution