
How to scale all columns except last column?

I'm using Python 3.7.6.

I'm working on a classification problem.

I want to scale the feature columns of my DataFrame (df). It has 56 columns: 55 feature columns, with the target column last.

I'm doing it as follows:

y = df.iloc[:,-1]
target_name = df.columns[-1]
from FeatureScaling import feature_scaling
df = feature_scaling.scale(df.iloc[:,0:-1], standardize=False)
df[target_name] = y

This seems inefficient, because I have to rebuild the DataFrame afterwards (add the target column back onto the scaled result).

Is there an efficient way to scale only some columns and leave the others unchanged, so that the result contains the scaled columns plus the one unscaled column?

user3668129 asked Oct 14 '25 11:10


1 Answer

Selecting columns by position for scaling or other preprocessing is fragile: every time you add a new feature, the positions shift and the code breaks. Select the columns by name instead, e.g.

Using scikit-learn:

import pandas as pd
from sklearn.preprocessing import StandardScaler

# names of the columns to standardize: here, every column except the target
# (target_name is the name of the target column, as defined in the question)
features = [col for col in df.columns if col != target_name]
scaler = StandardScaler()
# fit_transform returns a 2D numpy.ndarray; wrap it in a pd.DataFrame and
# reuse df's index so that the concat below aligns row by row
standardized_features = pd.DataFrame(scaler.fit_transform(df[features]), columns=features, index=df.index)
old_shape = df.shape
# drop the unscaled features from the DataFrame
df = df.drop(columns=features)
# join the standardized features back (they now come after the remaining columns)
df = pd.concat([df, standardized_features], axis=1)
assert old_shape == df.shape, "something went wrong!"
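
If the goal is only to avoid rebuilding the DataFrame, one option is to assign the scaled array straight back to the selected columns, which leaves the other columns, the index and the column order untouched. A minimal sketch, assuming a toy frame with hypothetical columns f1, f2 and target; MinMaxScaler is used only as an example, and StandardScaler can be swapped in the same way:

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Toy frame standing in for the real data; all names here are hypothetical.
df = pd.DataFrame(
    {
        "f1": [1.0, 2.0, 3.0, 4.0],
        "f2": [10.0, 20.0, 30.0, 50.0],
        "target": [0, 1, 0, 1],
    },
    index=[10, 11, 12, 13],  # non-default index, to show it is preserved
)

target_name = "target"
features = [col for col in df.columns if col != target_name]

scaler = MinMaxScaler()
# Assigning the returned array to df[features] overwrites only those columns;
# the target column, the index and the column order stay as they were.
df[features] = scaler.fit_transform(df[features])
```

Keeping the fitted scaler object around also makes it possible to apply the same scaling to a test set later with scaler.transform.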

Alternatively, if you would rather not split the data and join it back, you can use a function like this:

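import numpy as np

def normalize(x):
    # x is a whole column (pd.Series), not a single value
    std = np.std(x)
    if std == 0:
        raise ValueError('Constant column')
    return (x - np.mean(x)) / std

for col in features:
    # call normalize on the full column; Series.map would pass one element
    # at a time, and the std of a single value is always 0
    df[col] = normalize(df[col])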
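
The per-column loop above can also be written as a single vectorized pandas expression. A minimal sketch, assuming a hypothetical helper named standardize_columns and a toy frame with columns a, b and target; it uses the population standard deviation (ddof=0), which is also NumPy's default for np.std:

```python
import pandas as pd


def standardize_columns(df, columns):
    """Return a copy of df with the given columns scaled to zero mean, unit variance."""
    stds = df[columns].std(ddof=0)  # population std, one value per column
    constant = stds[stds == 0].index.tolist()
    if constant:
        raise ValueError("Constant column(s): {}".format(constant))
    out = df.copy()
    # Column-wise arithmetic: pandas aligns the means and stds by column name.
    out[columns] = (df[columns] - df[columns].mean()) / stds
    return out


# Hypothetical toy data; only "a" and "b" are scaled, "target" is left alone.
df = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [2.0, 4.0, 6.0], "target": [0, 1, 1]})
scaled = standardize_columns(df, ["a", "b"])
```

Returning a copy rather than modifying df in place is a design choice made here so the original values stay available; assigning the result back to df gives the in-place behaviour.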

Bishwarup Bhattacharjee answered Oct 17 '25 01:10