Question

I want to know how L1 & L2 regularization work in LightGBM and how to interpret the feature importances.

The scenario: I used LGBMRegressor with RandomizedSearchCV (cv=3, n_iter=50) on a dataset of 400,000 observations & 160 variables. To avoid overfitting / regularize the model, I provided the ranges below for the alpha/L1 & lambda/L2 parameters, and the best values as per the randomized search are reg_lambda=1 & reg_alpha=0.5.

'reg_lambda': [0.5, 1, 3, 5, 10]
'reg_alpha': [0.5, 1, 3, 5, 10]
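For reference, here is a minimal sketch of the setup I'm describing (synthetic stand-in data; my real search also tunes other hyperparameters, which I've omitted here):

```python
# Minimal sketch of the tuning setup described above; the synthetic data
# and random_state values are placeholders, not my actual dataset/code.
from lightgbm import LGBMRegressor
from sklearn.datasets import make_regression
from sklearn.model_selection import RandomizedSearchCV

# Stand-in for the real 400,000 x 160 dataset.
X, y = make_regression(n_samples=10_000, n_features=160, random_state=42)

param_distributions = {
    'reg_lambda': [0.5, 1, 3, 5, 10],  # L2 penalty
    'reg_alpha': [0.5, 1, 3, 5, 10],   # L1 penalty
}
# (With only these two lists, 50 iterations covers the full 25-point grid;
# the real search included other tuned parameters as well.)

search = RandomizedSearchCV(
    LGBMRegressor(random_state=42),
    param_distributions=param_distributions,
    n_iter=50,
    cv=3,
    random_state=42,
)
search.fit(X, y)
print(search.best_params_)  # best values reported above: reg_lambda=1, reg_alpha=0.5
```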

Now my question: the feature importance values with the optimized reg_lambda=1 & reg_alpha=0.5 are very different from those without any input for reg_lambda & reg_alpha. The regularized model considers only the top 5-6 features important and drives the importance values of the other features to practically zero (refer to the images). Is that normal behaviour of L1/L2 regularization in LightGBM?

To explain the LightGBM output with L1/L2 further: the top 5 important features are the same in both cases (with/without regularization); however, the importance values after the top 2 features are shrunk significantly by the L1/L2-regularized model, and beyond the top 5 features the regularized model makes the importance values practically zero (refer to the images of feature importance values in both cases).
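This is roughly how I'm producing the two sets of importance values I'm comparing (continuing from the sketch above):

```python
# Fit with and without the tuned penalties and compare LightGBM's
# built-in (split-count) feature importances side by side.
import pandas as pd
from lightgbm import LGBMRegressor

baseline = LGBMRegressor(random_state=42).fit(X, y)
regularized = LGBMRegressor(reg_lambda=1, reg_alpha=0.5, random_state=42).fit(X, y)

importances = pd.DataFrame({
    'baseline': baseline.feature_importances_,
    'regularized': regularized.feature_importances_,
}).sort_values('baseline', ascending=False)

print(importances.head(10))  # the pattern I see: top features agree, the tail drops to ~0
```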

A related question: how should I interpret the importance values, and when I run the LGBM model with the RandomizedSearchCV best parameters, do I need to remove the features with low importance values and then rerun the model? Or should I run it with all the features, trusting the LGBM algorithm (with L1 & L2 regularization) to take care of the low-importance features and give them no weight, or perhaps a minute weight, when it makes predictions? A sketch of the first option follows below.
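To make the first option concrete, the feature removal I'm asking about would look roughly like this (the zero threshold is just an example; this continues from the sketches above):

```python
# Sketch of the "drop low-importance features and refit" option.
keep = regularized.feature_importances_ > 0  # keep features the regularized model actually used
X_reduced = X[:, keep]                       # X is the NumPy array from the earlier sketch

pruned = LGBMRegressor(reg_lambda=1, reg_alpha=0.5, random_state=42).fit(X_reduced, y)
```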

Any help will be highly appreciated.

Regards,
Vikrant

[Image: Feature importances with L1 & L2 regularization]
[Image: Feature importances without L1 & L2 regularization]

