I need to select some features from a dataset for a regression task, but the numerical values of the features are on very different scales.
from sklearn.datasets import load_boston
from sklearn.feature_selection import SelectKBest, f_regression
X, y = load_boston(return_X_y=True)
X_new = SelectKBest(f_regression, k=2).fit_transform(X, y)
To increase the performance of the regression model, do I need to normalize X before applying SelectKBest?
It depends on your data and on the model you train afterwards, so try it and compare. Note that f_regression scores each feature by its correlation with the target, which is not affected by standardization, so scaling will not change which features SelectKBest picks; it mainly helps scale-sensitive models (e.g. regularized or distance-based ones) fitted on the selected features. Here's a quick way to transform each variable so that it has a mean of 0 and a variance of 1:
from sklearn.datasets import load_boston
from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.preprocessing import StandardScaler

X, y = load_boston(return_X_y=True)

# Standardize each feature to mean 0 and unit variance
X = StandardScaler().fit_transform(X)

# Keep the 2 features with the highest univariate F scores
X_new = SelectKBest(f_regression, k=2).fit_transform(X, y)
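If the goal is to measure whether scaling actually improves the final model, one option is to chain the scaler, the selector, and an estimator in a Pipeline and cross-validate the whole thing, so the scaler and selector are only fitted on the training folds. This is just a sketch: the Ridge estimator, k=2, and the 5-fold R^2 scoring are arbitrary choices, not something from the question, and load_boston was removed in scikit-learn 1.2, so substitute any regression dataset there.

from sklearn.datasets import load_boston  # removed in scikit-learn 1.2; any (X, y) regression data works
from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_boston(return_X_y=True)

# Scaling and selection are refit on each training fold, avoiding leakage
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("select", SelectKBest(f_regression, k=2)),
    ("model", Ridge()),
])

scores = cross_val_score(pipe, X, y, cv=5, scoring="r2")
print(scores.mean())

You can then drop the "scale" step from the pipeline and rerun the cross-validation to see whether standardization makes a difference for your particular data and model.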