Question

I'm trying to understand how the base value is calculated, so I used an example from SHAP's GitHub notebooks: Census income classification with LightGBM.

Right after training the LightGBM model, I applied explainer.shap_values() to each row of the test set individually. Calling force_plot() then shows the base value, the model output value, and the contribution of each feature.

My understanding is that the base value is derived when the model has no features. But how is it actually calculated in SHAP?

Answer

As you say, it is the output of a model with no features, which in general is the average of the outcome variable in the training set (often expressed in log-odds for classification). With force_plot, you pass the base value you want as the first argument; in that notebook it is explainer.expected_value[1], the average for the second class.
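To make the idea of a "feature-less" prediction concrete, here is a minimal pure-Python sketch. It is not how SHAP's tree explainer is implemented; it is a brute-force Shapley computation on a toy problem. The model function, the background rows, and the helper names (model, background, value, shapley_values) are all hypothetical and chosen only for illustration. It shows two things: the base value is the average model output over the background data, and the base value plus the feature contributions adds up to the model output for the explained row.

```python
from itertools import combinations
from math import factorial

# Hypothetical toy model returning a raw score (think of it as log-odds).
def model(x):
    return 2.0 * x[0] + x[1] * x[2] - 1.0

# Hypothetical background ("training") rows used to average out unknown features.
background = [
    (0.0, 1.0, 2.0),
    (1.0, 0.0, 1.0),
    (2.0, 1.0, 0.0),
    (1.0, 2.0, 1.0),
]

def value(x, subset):
    """Mean model output when the features in `subset` take x's values
    and all other features are taken from each background row in turn."""
    total = 0.0
    for row in background:
        mixed = tuple(x[i] if i in subset else row[i] for i in range(len(x)))
        total += model(mixed)
    return total / len(background)

def shapley_values(x):
    """Exact Shapley values by enumerating every feature subset (tiny n only)."""
    n = len(x)
    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        contrib = 0.0
        for k in range(n):
            for s in combinations(others, k):
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                contrib += weight * (value(x, set(s) | {i}) - value(x, set(s)))
        phi.append(contrib)
    return phi

x = (2.0, 1.0, 3.0)                # the row being explained
base_value = value(x, set())       # no features known: mean output over background
phi = shapley_values(x)

print("base value:", base_value)
print("contributions:", phi)
print("base + sum of contributions:", base_value + sum(phi))
print("model output:", model(x))
```

In this sketch the base value does not depend on the row being explained, which matches what the force plots show: every row starts from the same base value, and only the contributions differ.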

https://github.com/slundberg/shap/blob/06c9d18f3dd014e9ed037a084f48bfaf1bc8f75a/shap/plots/force.py#L31

https://github.com/slundberg/shap/issues/352#issuecomment-447485624

License: CC-BY-SA with attribution
Not affiliated with datascience.stackexchange