Question

I'm trying to understand how the base value is calculated, so I used an example from SHAP's GitHub notebook, Census income classification with LightGBM.

Right after training the LightGBM model, I applied explainer.shap_values() to each row of the test set individually. Passing the result to force_plot() shows the base value, the model output value, and the contributions of the features. [image: force plot]

My understanding is that the base value is derived when the model has no features. But how is it actually calculated in SHAP?


Solution

As you say, it's the value of a feature-less model: the prediction you would make knowing none of the features, which is generally the average of the model's output over the training set (in log-odds for classification). With force_plot, you actually pass your desired base value as the first parameter; in that notebook's case it is explainer.expected_value[1], the base value for the second class.

https://github.com/slundberg/shap/blob/06c9d18f3dd014e9ed037a084f48bfaf1bc8f75a/shap/plots/force.py#L31

https://github.com/slundberg/shap/issues/352#issuecomment-447485624
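
To make the "feature-less model" idea concrete, here is a minimal sketch that computes exact Shapley values for a toy model by brute force. It is not SHAP's implementation: the model, the background rows, and the helper names (model, coalition_value, shapley_values) are all made up for illustration, and filling unknown features from background rows is just one common way to define "the model without a feature". It shows that the value of the empty coalition is the mean prediction over the background data, and that base value plus contributions adds up to the model output, which is what force_plot draws.

```python
from itertools import combinations
from math import factorial


def model(x):
    # Hypothetical toy model returning a raw score (think log-odds).
    return 0.5 * x[0] + 2.0 * x[1] - 1.0 * x[0] * x[2]


# Hypothetical background ("training") rows.
background = [
    [1.0, 0.0, 2.0],
    [0.0, 1.0, 1.0],
    [2.0, 1.0, 0.0],
    [1.0, 2.0, 1.0],
]


def coalition_value(x, known):
    """Average model output when only the features in `known` take x's
    values; the other features are filled in from each background row."""
    total = 0.0
    for row in background:
        mixed = [x[i] if i in known else row[i] for i in range(len(x))]
        total += model(mixed)
    return total / len(background)


def shapley_values(x):
    """Exact Shapley values by enumerating every coalition of features."""
    n = len(x)
    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        contrib = 0.0
        for size in range(n):
            # Standard Shapley weight for a coalition of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                gain = coalition_value(x, s | {i}) - coalition_value(x, s)
                contrib += weight * gain
        phi.append(contrib)
    return phi


x = [2.0, 1.0, 3.0]                      # row to explain
base_value = coalition_value(x, set())   # no features known
phi = shapley_values(x)                  # per-feature contributions
mean_prediction = sum(model(r) for r in background) / len(background)

print("base value:     ", base_value)
print("mean prediction:", mean_prediction)
print("base + sum(phi):", base_value + sum(phi))
print("model output:   ", model(x))
```

In the notebook, explainer.expected_value[1] plays the role of base_value here, and the row of SHAP values plays the role of phi.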

Licensed under: CC-BY-SA with attribution