Question

My GridSearchCV for my random forest crashes. I would like to know the cause and how to make it work:

```python
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# Grid search for random forest
param_grid = {
    'bootstrap': [True],
    'n_estimators': [100, 200, 300, 400, 500],
    'max_depth': [50, 100, None],
    'max_features': ['auto', 200],
    'min_impurity_decrease': [0],
    'min_samples_split': [2, 5],
    'min_samples_leaf': [2, 5],
    'oob_score': [True],
    'warm_start': [True]
}

# Base model for improvement
rf_gridsearch = RandomForestRegressor(random_state=42)

# Grid search initiation
rf_gridsearch = GridSearchCV(estimator=rf_gridsearch, param_grid=param_grid,
                             scoring='neg_mean_absolute_error', cv=5,
                             n_jobs=-1, verbose=5)

# Perform the grid search for the model (X_train, y_train are my training data)
rf_gridsearch.fit(X_train, y_train)
```

Solution

First, you are fitting $5 \cdot 3 \cdot 2 \cdot 2 \cdot 2 \cdot 5 = 600$ models (120 parameter combinations times 5 cross-validation folds), and `n_estimators=500` is quite large. Whether that is feasible depends on your dataset and on your computing power.
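The count above can be checked with a few lines of plain Python. This is only a sketch: it copies the grid and `cv=5` from the question, and the variable names `n_candidates`, `n_fits` and `total_trees` are made up for illustration.

```python
import math
from itertools import product

# Grid copied from the question; only the list lengths matter here.
param_grid = {
    'bootstrap': [True],
    'n_estimators': [100, 200, 300, 400, 500],
    'max_depth': [50, 100, None],
    'max_features': ['auto', 200],
    'min_impurity_decrease': [0],
    'min_samples_split': [2, 5],
    'min_samples_leaf': [2, 5],
    'oob_score': [True],
    'warm_start': [True],
}
cv = 5  # number of cross-validation folds used in the question

# Number of parameter combinations = product of the list lengths.
n_candidates = math.prod(len(values) for values in param_grid.values())

# Each combination is fitted once per fold.
n_fits = n_candidates * cv

# Total number of trees grown across all fits (hypothetical helper metric).
keys = list(param_grid)
idx = keys.index('n_estimators')
total_trees = cv * sum(combo[idx] for combo in product(*param_grid.values()))

print(n_candidates, n_fits, total_trees)
```

Counting trees rather than fits gives a rough feel for the workload. With `n_jobs=-1` several of these forests are also being built at the same time, which adds to peak memory use.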

My first guess is that you do not have enough RAM on your laptop (if that is where you are running it), and that is why it crashes.

If that is the problem, I recommend sampling your data down to 1/10 or less (depending on your data), searching for the best hyperparameters on that sample, and then training the final model on your whole dataset.
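A minimal sketch of that workflow is shown below. It makes several assumptions: the data is a synthetic stand-in created with `make_regression` (your own `X_train` and `y_train` would replace it), the grid `small_grid` is a deliberately reduced, hypothetical version of the one in the question, and `n_jobs=1` is chosen to keep memory use low rather than for speed.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the real training data (assumption).
X_train, y_train = make_regression(n_samples=2000, n_features=20,
                                   noise=10.0, random_state=42)

# Keep about 1/10 of the rows for the hyperparameter search.
X_small, _, y_small, _ = train_test_split(X_train, y_train,
                                          train_size=0.1, random_state=42)

# Reduced, illustrative grid (not the grid from the question).
small_grid = {
    'n_estimators': [50, 100],
    'max_depth': [10, None],
    'min_samples_leaf': [2, 5],
}

search = GridSearchCV(
    estimator=RandomForestRegressor(random_state=42),
    param_grid=small_grid,
    scoring='neg_mean_absolute_error',
    cv=3,
    n_jobs=1,      # one worker at a time to limit memory use
    refit=False,   # skip refitting on the sample; we refit on all data below
)
search.fit(X_small, y_small)

# Train the final model on the whole training set with the best parameters.
final_model = RandomForestRegressor(random_state=42, **search.best_params_)
final_model.fit(X_train, y_train)

print(search.best_params_)
```

Hyperparameters found on a sample are not guaranteed to be optimal for the full dataset, particularly size-dependent ones such as `min_samples_leaf`, so it may be worth re-checking a few neighbouring values on the full data if memory allows.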

Licensed under: CC-BY-SA with attribution