How do I deploy a model when using Stratified K fold?

https://datascience.stackexchange.com/questions/81279

13-12-2020

Question

I have used stratified k-fold cross-validation to train and evaluate the model. Below is the Python code:

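>from sklearn.ensemble import GradientBoostingClassifier
>from sklearn.metrics import classification_report
>from sklearn.model_selection import StratifiedKFold
>
>def stratified_cv_v1(X, y, clf, shuffle=True, n=10):
>    stratified_k_fold = StratifiedKFold(n_splits=n, shuffle=shuffle)
>    y_pred_v1 = y.copy()
>    # Each fold: fit a fresh model on the training part, predict the held-out part.
>    for ii, jj in stratified_k_fold.split(X, y):
>        X_train, X_test = X[ii], X[jj]
>        y_train = y[ii]
>        clf_v2 = clf()  # clf is a classifier class, instantiated once per fold
>        clf_v2.fit(X_train, y_train)
>        y_pred_v1[jj] = clf_v2.predict(X_test)
>    return y_pred_v1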


>print(classification_report(y, stratified_cv_v1(X, y, GradientBoostingClassifier)))

Now, how do I deploy the model on a new data set for which I need predictions?

Solution

k-fold CV is meant to evaluate a modeling method, not to produce the model that gets deployed. Once the evaluation is done and one is ready to move to deployment, there's no point in using CV anymore: the method has been tested and validated, so one can reasonably assume that applying the same method to the same kind of data will lead to a similar level of performance. The models fitted inside the CV loop can be discarded. Thus the usual process is:

1. Train a final model on the full dataset (no CV, no testing)
2. Apply the model to new instances
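
The two steps above can be sketched as follows. This is only a minimal illustration, not a definitive recipe: the data is synthetic (generated with `make_classification`), and the names `X_new`, `final_model` and `new_predictions` are hypothetical placeholders for your own data and variables. It assumes the same `GradientBoostingClassifier` with default settings that was evaluated in the question.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

# Synthetic stand-in data (an assumption): the first 200 rows play the role
# of the labelled dataset, the last 5 rows play the role of new instances.
X_all, y_all = make_classification(n_samples=205, n_features=10, random_state=0)
X, y = X_all[:200], y_all[:200]
X_new = X_all[200:]

# Step 1: train one final model on the full labelled dataset (no CV, no test split).
final_model = GradientBoostingClassifier(random_state=0)
final_model.fit(X, y)

# Step 2: apply the final model to the new instances.
new_predictions = final_model.predict(X_new)
print(new_predictions)
```

If the predictions have to be made in a different process than the one that trained the model, the fitted `final_model` could be saved to disk and loaded again later, for example with `joblib.dump` and `joblib.load`.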

Licensed under: CC-BY-SA with attribution

Not affiliated with datascience.stackexchange