My GridSearchCV for my random forest crashes. I need to know the reason and how to make it work:

```python
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# Grid search for random forest
param_grid = {
    'bootstrap': [True],
    'n_estimators': [100, 200, 300, 400, 500],
    'max_depth': [50, 100, None],
    'max_features': ['auto', 200],
    'min_impurity_decrease': [0],
    'min_samples_split': [2, 5],
    'min_samples_leaf': [2, 5],
    'oob_score': [True],
    'warm_start': [True]
}

# Base model for improvement
rf_gridsearch = RandomForestRegressor(random_state=42)

# Grid search initiation
rf_gridsearch = GridSearchCV(estimator = rf_gridsearch, param_grid = param_grid,
                           scoring = 'neg_mean_absolute_error', cv = 5,
                           n_jobs = -1, verbose = 5)

# Perform the grid search for the model
rf_gridsearch.fit(X_train, y_train)
```

Solution

First, you are fitting $5 \cdot 3 \cdot 2 \cdot 2 \cdot 2 \cdot 5 = 600$ models (120 parameter combinations, each evaluated on 5 cross-validation folds), and `n_estimators=500` is quite large. Whether that is a problem depends on your dataset and on your computing power.
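To double-check that count, here is a small sketch using only the standard library. It copies the `param_grid` from the question; `cv_folds`, `n_candidates`, and `n_fits` are names introduced here for illustration only.

```python
import math

# Parameter grid copied from the question.
param_grid = {
    'bootstrap': [True],
    'n_estimators': [100, 200, 300, 400, 500],
    'max_depth': [50, 100, None],
    'max_features': ['auto', 200],
    'min_impurity_decrease': [0],
    'min_samples_split': [2, 5],
    'min_samples_leaf': [2, 5],
    'oob_score': [True],
    'warm_start': [True],
}
cv_folds = 5  # matches cv=5 in the question

# A grid search tries every combination, so the number of candidates
# is the product of the lengths of all value lists.
n_candidates = math.prod(len(values) for values in param_grid.values())

# Each candidate is fitted once per cross-validation fold.
n_fits = n_candidates * cv_folds

print(n_candidates, n_fits)  # 120 600
```

Removing a single value from one of the longer lists (for example, dropping `n_estimators=500`) already shrinks the search noticeably, which makes this a quick way to estimate the cost before launching it.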

My first guess is that you do not have enough RAM on your laptop (if that is where you are running it), and that is why it crashes.

If that is the cause, I recommend sampling your data down to 1/10 or less (depending on how much data you have), searching for the best hyperparameters on that sample, and then training the final model on your whole dataset.
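The subsample-then-refit idea could look roughly like the sketch below. Everything in it is an assumption made for illustration: the data is synthetic (generated with `make_regression`, since the real `X_train` and `y_train` are not shown), the grid is deliberately tiny so the example runs quickly, and `n_jobs=1` is a choice made here to keep peak memory low rather than something taken from the question.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in for the real training data (assumption).
X_train, y_train = make_regression(
    n_samples=2000, n_features=20, noise=10.0, random_state=42
)

# Draw a random 10% subsample without replacement.
rng = np.random.default_rng(42)
sample_idx = rng.choice(len(X_train), size=len(X_train) // 10, replace=False)
X_small, y_small = X_train[sample_idx], y_train[sample_idx]

# Deliberately small illustrative grid, not the grid from the question.
small_grid = {
    'n_estimators': [50, 100],
    'min_samples_leaf': [2, 5],
}

search = GridSearchCV(
    estimator=RandomForestRegressor(random_state=42),
    param_grid=small_grid,
    scoring='neg_mean_absolute_error',
    cv=3,
    n_jobs=1,  # a single worker keeps peak memory low
)
search.fit(X_small, y_small)

# Train the final model once on the full data with the best settings found.
final_model = RandomForestRegressor(random_state=42, **search.best_params_)
final_model.fit(X_train, y_train)

print(search.best_params_)
```

Keep in mind that settings tuned on a small sample are only a starting point: parameters tied to the number of samples, such as `min_samples_leaf`, may need adjusting once the model sees the full dataset.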

Licensed under: CC-BY-SA with attribution