    # training the model
    from sklearn import linear_model

    model_1_features = ['sqft_living', 'bathrooms', 'bedrooms', 'lat', 'long']
    model_2_features = model_1_features + ['bed_bath_rooms']
    model_3_features = model_2_features + ['bedrooms_squared', 'log_sqft_living', 'lat_plus_long']

    model_1 = linear_model.LinearRegression()
    model_1.fit(train_data[model_1_features], train_data['price'])

    model_2 = linear_model.LinearRegression()
    model_2.fit(train_data[model_2_features], train_data['price'])

    model_3 = linear_model.LinearRegression()
    model_3.fit(train_data[model_3_features], train_data['price'])

    # extracting the coefficients
    print(model_1.coef_)
    print(model_2.coef_)
    print(model_3.coef_)
If I change the order of the features, the coefficients are still printed in the same order, so I would like to know how each coefficient maps to its feature.
How to Find the Regression Coefficient. A regression coefficient is the same thing as the slope of the line of the regression equation. The equation for the regression coefficient that you'll find on the AP Statistics test is: $B_1 = b_1 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}$.
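As a quick check of that slope formula, here is a minimal sketch in plain Python. The function name `regression_slope` and the four data points are made up for illustration; the points lie exactly on y = 2x + 1, so the slope should come out as 2.

```python
def regression_slope(xs, ys):
    """Least-squares slope b1 for a simple linear regression of ys on xs."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Numerator: sum of (x_i - x_mean) * (y_i - y_mean)
    numerator = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    # Denominator: sum of (x_i - x_mean)^2
    denominator = sum((x - x_mean) ** 2 for x in xs)
    return numerator / denominator


# Made-up points lying exactly on y = 2x + 1
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope = regression_slope(xs, ys)
print(slope)
```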
    import numpy as np

    # Simulate data using a quadratic equation with coefficients y = ax^2 + bx + c
    a, b, c = 1, 2, 3
    x = np.arange(100)

    # Add random component to y values for estimation
    y = a*x**2 + b*x + c + np.random.randn(100)

    # Get X matrix [100x3]
    X = np.
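The snippet above is cut off at the design matrix. One possible way to finish the idea, sketched under my own assumptions (columns ordered as x², x, 1; a fixed random seed; `np.linalg.lstsq` as the solver), also shows the same name-to-coefficient pairing the question asks about:

```python
import numpy as np

# Fixed seed so the run is reproducible (an assumption, not in the original snippet)
rng = np.random.default_rng(0)

# Simulate data from y = a*x^2 + b*x + c plus standard normal noise
a, b, c = 1, 2, 3
x = np.arange(100, dtype=float)
y = a * x**2 + b * x + c + rng.standard_normal(100)

# Design matrix [100x3]; the column order x^2, x, 1 is an assumed layout
X = np.column_stack([x**2, x, np.ones_like(x)])

# Ordinary least squares; coefficients come back in the column order of X
coefs, *_ = np.linalg.lstsq(X, y, rcond=None)

# Pair each estimate with the name of the column it belongs to
estimates = dict(zip(["a", "b", "c"], coefs))
print(estimates)
```

Because the estimates follow the column order of `X`, zipping them with a list of names in that same order is enough to label them.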
If you want to extract a summary of a regression model in Python, you can use the statsmodels package: fit the same multiple linear regression model as in the earlier example with it, then ask the fitted model for its summary.
The trick is that coef_ follows the order of the columns you passed to fit, so right after you have trained your model you know the order of the coefficients:
    model_1 = linear_model.LinearRegression()
    model_1.fit(train_data[model_1_features], train_data['price'])
    print(list(zip(model_1.coef_, model_1_features)))
This prints each coefficient paired with its feature. (Tested with a pandas DataFrame.)
If you want to reuse the coefficients later you can also put them in a dictionary:
    coef_dict = {}
    for coef, feat in zip(model_1.coef_, model_1_features):
        coef_dict[feat] = coef
(You can test it for yourself by training two models with the same features but, as you said, shuffled order of features.)
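That check could look something like the sketch below. The data is synthetic and the helper `fit_coef_dict` is a made-up name; only `LinearRegression`, `fit`, and `coef_` come from scikit-learn. It trains on the same three features in two different orders and compares the resulting dictionaries.

```python
import numpy as np
import pandas as pd
from sklearn import linear_model

# Synthetic stand-in for train_data (all values are made up for illustration)
rng = np.random.default_rng(0)
train_data = pd.DataFrame({
    "sqft_living": rng.uniform(500, 4000, 200),
    "bathrooms": rng.integers(1, 4, 200).astype(float),
    "bedrooms": rng.integers(1, 6, 200).astype(float),
})
train_data["price"] = (
    150.0 * train_data["sqft_living"]
    + 10000.0 * train_data["bathrooms"]
    - 5000.0 * train_data["bedrooms"]
    + rng.normal(0, 1000, 200)
)


def fit_coef_dict(features):
    """Fit a linear model on the given columns and map feature -> coefficient."""
    model = linear_model.LinearRegression()
    model.fit(train_data[features], train_data["price"])
    # coef_ is ordered like the columns passed to fit, so zip pairs them correctly
    return dict(zip(features, model.coef_))


original = fit_coef_dict(["sqft_living", "bathrooms", "bedrooms"])
shuffled = fit_coef_dict(["bedrooms", "sqft_living", "bathrooms"])

# Each feature should get the same coefficient regardless of column order
for feat in original:
    print(feat, original[feat], shuffled[feat])
```

If the mapping is right, the two values printed on each line agree up to floating-point error, even though the raw `coef_` arrays are in different orders.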
@Robin posted a great answer, but I had to make one tweak to get it to work the way I wanted: index into the dimension of the coef_ NumPy array that I needed, i.e. use model_1.coef_[0,:], as below:
    coef_dict = {}
    for coef, feat in zip(model_1.coef_[0,:], model_1_features):
        coef_dict[feat] = coef
Then the dict was created as I pictured it, with {'feature_name' : coefficient_value} pairs.
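A likely reason the extra index was needed (this is my guess, the answer does not say) is that the target was passed as a 2-D array, such as a one-column DataFrame. In that case scikit-learn's `LinearRegression` stores `coef_` with shape (n_targets, n_features) instead of (n_features,). A small sketch with made-up data and hypothetical feature names:

```python
import numpy as np
from sklearn import linear_model

rng = np.random.default_rng(0)
features = ["f0", "f1", "f2"]  # hypothetical feature names
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 3.0  # noise-free, made-up target

# Same data, but the target is passed once as 1-D and once as a 2-D column
model_1d = linear_model.LinearRegression().fit(X, y)
model_2d = linear_model.LinearRegression().fit(X, y.reshape(-1, 1))

print(model_1d.coef_.shape)  # 1-D target: one coefficient per feature
print(model_2d.coef_.shape)  # 2-D target: one row of coefficients per target

# For a single target, np.ravel flattens either shape before zipping
coef_dict = dict(zip(features, np.ravel(model_2d.coef_)))
print(coef_dict)
```

Using `np.ravel` only makes sense with a single target; with several targets each row of `coef_` would need its own dictionary.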