I am trying to select important features (or at least understand which features explain more of the variability) in a given dataset. To do this I use both ExtraTreesClassifier and GradientBoostingRegressor, and then:
clf = ExtraTreesClassifier(n_estimators=10, max_features='auto', random_state=0)  # builds a forest of 10 trees, right?
clf.fit(x_train, y_train)
feature_importance=clf.feature_importances_ # does NOT work - returns NoneType for feature_importance
After this I am really interested in plotting them (for a visual representation), or, as a first step, just looking at the relative order of importance and the corresponding indices:
# Both of these fail because feature_importance is NoneType
import numpy
feature_importance = 100.0 * (feature_importance / feature_importance.max())
indices = numpy.argsort(feature_importance)[::-1]
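For comparison, here is a minimal, self-contained sketch (the synthetic data from sklearn.datasets.make_classification is my own stand-in for x_train/y_train) showing that in scikit-learn, feature_importances_ on a fitted ExtraTreesClassifier returns a NumPy array, never None, so a NoneType suggests fit() did not actually run on the estimator being inspected, or a very old scikit-learn version:

```python
import numpy
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier

# Synthetic stand-in for x_train / y_train: 12 features, 3 label values
x_train, y_train = make_classification(n_samples=200, n_features=12,
                                       n_informative=4, n_classes=3,
                                       n_clusters_per_class=1, random_state=0)

clf = ExtraTreesClassifier(n_estimators=10, random_state=0)
clf.fit(x_train, y_train)

feature_importance = clf.feature_importances_   # numpy.ndarray, not None
feature_importance = 100.0 * (feature_importance / feature_importance.max())
indices = numpy.argsort(feature_importance)[::-1]
print(indices)  # feature indices, most important first
```

Note that max_features='auto' is dropped here: it has been deprecated and then removed in recent scikit-learn releases, and the default works fine for this purpose.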
What I found puzzling is that if I use GradientBoostingRegressor as below, I do get the feature importances and the indices derived from them. What am I doing wrong?
# Works with GradientBoostingRegressor
params = {'n_estimators': 100, 'max_depth': 3, 'learning_rate': 0.1, 'loss': 'lad'}
clf = GradientBoostingRegressor(**params).fit(x_train, y_train)  # already fitted here, no second fit() needed
feature_importance = clf.feature_importances_
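Once feature_importances_ comes back as an array, the plotting part is straightforward. A sketch with matplotlib (the random importances and the 'var %d' labels are invented stand-ins for real values and feature names; the Agg backend is used so it runs headless):

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend; writes to file instead of a window
import matplotlib.pyplot as plt
import numpy

# Pretend importances for 12 features (stand-in for clf.feature_importances_)
rng = numpy.random.RandomState(0)
feature_importance = rng.rand(12)
feature_importance = 100.0 * (feature_importance / feature_importance.max())

indices = numpy.argsort(feature_importance)  # ascending, so the top bar is the most important
pos = numpy.arange(len(indices)) + 0.5

plt.barh(pos, feature_importance[indices], align='center')
plt.yticks(pos, ['var %d' % i for i in indices])
plt.xlabel('Relative importance (%)')
plt.title('Feature importance')
plt.savefig('feature_importance.png')
```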
Other info: I have 12 independent variables (x_train) and one label variable (y_train) with multiple values (say 4, 5, 7); type(x_train) is and type(feature_importance) is
Acknowledgments: some elements are borrowed from this post: http://www.tonicebrian.com/2012/11/05/training-gradient-boosting-trees-with-python/