
Normalising pandas data frame using StandardScaler() excluding a particular column

I have a data frame that I formed by merging the training (labelled) and test (unlabelled) data frames. To split the test rows back out later, I kept a column that identifies whether each row belongs to the training or the test set. Now I need to normalize the values in every column except this one column, "Sl No.", but I can't find a way to skip it. Here's what I was doing:

import pandas as pd
import numpy as np
from sklearn.preprocessing import StandardScaler

data_norm = data_x_filled.copy() #Has training + test data frames combined to form single data frame
normalizer = StandardScaler()
data_array = normalizer.fit_transform(data_norm)
data_norm = pd.DataFrame(data_array,columns = data_norm.columns).set_index(data_norm.index)

I just want to exclude the column "Sl No." from normalization, but still keep it in the data frame afterwards.

Shivendra asked Mar 11 '23 06:03


1 Answer

Try this; it should work. It scales every column except "Sl No." and writes the scaled values back, so "Sl No." keeps its original values and position:

data_norm = data_x_filled.copy()  # training + test data frames combined into a single data frame
normalizer = StandardScaler()
# Select every column except the identifier column "Sl No."
cols_to_scale = data_norm.columns.drop('Sl No.')
# Scale only those columns and assign them back; "Sl No." is left untouched
data_norm[cols_to_scale] = normalizer.fit_transform(data_norm[cols_to_scale])
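
As an alternative, here is a minimal sketch using scikit-learn's ColumnTransformer with remainder="passthrough", which scales the listed columns and passes the rest through unchanged. The toy data frame and its "height" and "weight" columns are hypothetical stand-ins for data_x_filled; only "Sl No." comes from the question.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in for data_x_filled (column names other than "Sl No." are made up)
data_x_filled = pd.DataFrame({
    "Sl No.": [1, 2, 3, 4],
    "height": [150.0, 160.0, 170.0, 180.0],
    "weight": [50.0, 60.0, 70.0, 80.0],
})

id_col = "Sl No."
scale_cols = [c for c in data_x_filled.columns if c != id_col]

# Scale only scale_cols; every other column is passed through unchanged
transformer = ColumnTransformer(
    [("scale", StandardScaler(), scale_cols)],
    remainder="passthrough",
)
data_array = transformer.fit_transform(data_x_filled)

# The output puts the scaled columns first, followed by the passthrough column
data_norm = pd.DataFrame(
    data_array,
    columns=scale_cols + [id_col],
    index=data_x_filled.index,
)

# Restore the original column order and the identifier's original dtype
data_norm = data_norm[data_x_filled.columns]
data_norm[id_col] = data_norm[id_col].astype(data_x_filled[id_col].dtype)

print(data_norm)
```

Because the transformer returns a plain array with the scaled columns first, the sketch rebuilds the data frame with explicit column names and then reorders them, so that labels never end up attached to the wrong values.
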
shivsn answered Mar 13 '23 19:03