Question

I'm a beginner in machine learning and want to train a CNN (for image recognition) with optimized hyperparameters such as dropout rate, learning rate, and number of epochs.

I'm trying to find the optimal hyperparameters via GridSearchCV from Scikit-learn. I have often read that GridSearchCV can be used in combination with early stopping, but I cannot find any sample code demonstrating this.

With EarlyStopping I would try to find the optimal number of epochs, but I don't know how to combine EarlyStopping with GridSearchCV, or at least with cross-validation.

Can anyone give me a hint on how to do that? It would be a great help.

My current code looks like this:

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout
from keras.optimizers import Adam
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import GridSearchCV

def create_model(dropout_rate_1=0.0, dropout_rate_2=0.0, learn_rate=0.001):
    model = Sequential()
    model.add(Conv2D(32, kernel_size=(3,3), input_shape=(28,28,1), activation='relu', padding='same'))
    model.add(Conv2D(32, kernel_size=(3,3), activation='relu', padding='same'))
    model.add(MaxPooling2D(pool_size=(2,2)))
    model.add(Dropout(dropout_rate_1))
    model.add(Flatten())  # flatten the feature maps before the dense layers
    model.add(Dense(128, activation='relu'))
    model.add(Dropout(dropout_rate_2))
    model.add(Dense(10, activation='softmax'))
    optimizer = Adam(lr=learn_rate)
    model.compile(loss='categorical_crossentropy', optimizer=optimizer,
                  metrics=['accuracy'])
    return model

model = KerasClassifier(build_fn=create_model, epochs=50, batch_size=10, verbose=0)
epochs = [30, 40, 50, 60]
dropout_rate_1 = [0.0, 0.2, 0.4, 0.6]
dropout_rate_2 = [0.0, 0.2, 0.4, 0.6]
learn_rate = [0.0001, 0.001, 0.01]
param_grid = dict(dropout_rate_1=dropout_rate_1, dropout_rate_2=dropout_rate_2,
                        learn_rate=learn_rate, epochs=epochs)
grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=-1, cv=5)
grid_result = grid.fit(X, y) 



Solution

Just to add to the other answers here: you simply need to include an early stopping callback in your fit().

Something like:

from keras.callbacks import EarlyStopping

# Define early stopping: stop once val_loss has not improved for `patience` epochs
early_stopping = EarlyStopping(monitor='val_loss', patience=epochs_to_wait_for_improve)
# Pass the callback into fit()
history = model.fit(..., callbacks=[early_stopping])
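
To connect this with the GridSearchCV code from the question: with the old keras.wrappers.scikit_learn.KerasClassifier, extra keyword arguments given to grid.fit() are, as far as I know, forwarded to the underlying model.fit(), so the callback and a validation split can be supplied there. A rough sketch reusing create_model and param_grid from the question (the patience and validation_split values are just placeholders):

from keras.callbacks import EarlyStopping
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import GridSearchCV

early_stopping = EarlyStopping(monitor='val_loss', patience=5,
                               restore_best_weights=True)

model = KerasClassifier(build_fn=create_model, epochs=50, batch_size=10, verbose=0)
grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=1, cv=5)

# The extra keyword arguments are passed on to model.fit() for every
# cross-validation fit, so each candidate stops as soon as val_loss
# stops improving instead of always running the full number of epochs.
grid_result = grid.fit(X, y, validation_split=0.1, callbacks=[early_stopping])

With this setup, the epochs values in the grid only act as an upper bound; early stopping decides when each fit actually ends, which makes searching over epochs largely redundant.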

Other tips

If you ask for code suggestions, please specify your framework in the future. I am assuming you are using Keras.

I can give you a minimal viable implementation for your case.

from sklearn.base import ClassifierMixin, BaseEstimator

class CNN_model(ClassifierMixin, BaseEstimator):
    def __init__(self, **model_params):
        """
        Define the model parameters within this init function.
        (Note: for GridSearchCV to tune them, each hyperparameter should be
        an explicit named argument rather than hidden in **model_params.)
        """
        self.model = ...  # use the params above to build a Keras model and store it in this variable
        self.model.compile(loss=..., optimizer=..., metrics=[...])  # please fill in the appropriate loss and metrics

    def fit(self, X, y, **training_params):
        # You specify everything in training_params, e.g. epochs and
        # callbacks (which include early stopping).
        self.model.fit(X, y, **training_params)
        return self

    def predict(self, X):
        return self.model.predict(X)

So basically the code above defines a custom scikit-learn estimator which, if you build it successfully, can be combined with GridSearchCV.
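
As a rough usage sketch of how that estimator would plug into GridSearchCV (assuming the __init__ above exposes the hyperparameters as explicit named arguments such as dropout_rate_1 and learn_rate, since GridSearchCV discovers tunable parameters from the __init__ signature; the grid values below are placeholders):

from keras.callbacks import EarlyStopping
from sklearn.model_selection import GridSearchCV

early_stopping = EarlyStopping(monitor='val_loss', patience=5)

param_grid = {
    'dropout_rate_1': [0.0, 0.2, 0.4],
    'learn_rate': [0.0001, 0.001, 0.01],
}
grid = GridSearchCV(CNN_model(), param_grid=param_grid, cv=5)

# Keyword arguments to grid.fit() are routed to CNN_model.fit() as fit
# parameters, which is where the early stopping callback comes in.
grid_result = grid.fit(X, y, epochs=50, validation_split=0.1,
                       callbacks=[early_stopping])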

GridSearchCV with early stopping - I was curious about your question. As long as the algorithm has a built-in early stopping feature, you can use it in this manner.

When it comes to other algorithms, it might not serve the purpose of early stopping, because you never know which parameters are going to be the best until you experiment with them.
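
For example, a sketch with scikit-learn's MLPClassifier, which has a built-in early_stopping option, so the stopping happens inside every cross-validation fit of the grid search (the grid values are just placeholders):

from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier

# The estimator stops training on its own once the validation score
# has not improved for n_iter_no_change iterations.
mlp = MLPClassifier(early_stopping=True, validation_fraction=0.1,
                    n_iter_no_change=10, max_iter=500)

param_grid = {'alpha': [1e-4, 1e-3], 'learning_rate_init': [1e-3, 1e-2]}
grid = GridSearchCV(mlp, param_grid=param_grid, cv=5, n_jobs=-1)
# grid_result = grid.fit(X, y)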

@Syenix I know how early stopping works, but when (at what point) should it usually be used? In other words, is early stopping used after optimizing the hyperparameters, or while optimizing them via GridSearchCV? So far, I have not found any insightful explanation of how to use early stopping and GridSearchCV in the correct order.

Licensed under: CC-BY-SA with attribution