I used shap
to determine the feature importance for multiple regression with correlated features.
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.datasets import load_boston
import shap
boston = load_boston()
regr = pd.DataFrame(boston.data)
regr.columns = boston.feature_names
regr['MEDV'] = boston.target
X = regr.drop('MEDV', axis = 1)
Y = regr['MEDV']
fit = LinearRegression().fit(X, Y)
explainer = shap.LinearExplainer(fit, X, feature_dependence = 'independent')
# I used 'independent' because the result is consistent with the ordinary
# shapely values where `correlated' is not
shap_values = explainer.shap_values(X)
shap.summary_plot(shap_values, X, plot_type = 'bar')
shap
offers a chart to get the shap values. Is there also a statistic available? I am interested in the exact shap values. I read the Github repository and the documentation but I found nothing regarding this topic.
When we look at shap_values
we see that it contains some positive and negative numbers, and its dimensions equal the dimensions of boston
dataset. Linear regression is a ML algorithm, which calculates optimal y = wx + b
, where y
is MEDV, x
is feature vector and w
is a vector of weights. In my opinion, shap_values
stores wx
- a matrix with the value of the each feauture multiplyed by the vector of weights calclulated by linear regression.
So to calculate wanted statistics, I first extracted absolute values and then averaged over them. The order is important! Next I used initial column names and sorted from biggest effect to smallest one. With this, I hope I have answered your question!:)
from matplotlib import pyplot as plt
#rataining only the size of effect
shap_values_abs = np.absolute(shap_values)
#dividing to get good numbers
means_norm = shap_values_abs.mean(axis = 0)/1e-15
#sorting values and names
idx = np.argsort(means_norm)
means = np.array(means_norm)[idx]
names = np.array(boston.feature_names)[idx]
#plotting
plt.figure(figsize=(10,10))
plt.barh(names, means)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With