Ignore string columns while doing

I am using the following code to normalize a pandas DataFrame:

df_norm = (df - df.mean()) / (df.max() - df.min())

This works fine when all columns are numeric. However, df now contains some string columns, and the normalization above raises an error. Is there a way to apply this normalization only to the numeric columns of a DataFrame, leaving the string columns unchanged?

Edamame asked Jun 19 '17 20:06




1 Answer

You can use select_dtypes to pick out the numeric columns and compute the normalization only on those:

import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': ['a', 'b', 'c'], 'c': [4, 5, 6]})

df

   a  b  c
0  1  a  4
1  2  b  5
2  3  c  6

df_num = df.select_dtypes(include='number')

df_num

   a  c
0  1  4
1  2  5
2  3  6

Then normalize the numeric columns and assign the result back to the original df:

df_norm = (df_num - df_num.mean()) / (df_num.max() - df_num.min())


df[df_norm.columns] = df_norm

df

     a  b    c
0 -0.5  a -0.5
1  0.0  b  0.0
2  0.5  c  0.5
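
If you would rather not overwrite the original df, the same steps can be wrapped in a small helper that works on a copy. This is only a sketch: the name normalize_numeric is made up for illustration and is not part of pandas, and it assumes every numeric column has at least two distinct values (a constant column has max == min, so the division would yield NaN).

```python
import pandas as pd


def normalize_numeric(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with its numeric columns mean-normalized.

    Hypothetical helper: non-numeric columns are copied through unchanged.
    """
    out = df.copy()
    # Names of the columns pandas considers numeric (ints, floats, ...).
    num_cols = out.select_dtypes(include="number").columns
    num = out[num_cols]
    # Same formula as in the question, applied to the numeric part only.
    out[num_cols] = (num - num.mean()) / (num.max() - num.min())
    return out


df = pd.DataFrame({"a": [1, 2, 3], "b": ["a", "b", "c"], "c": [4, 5, 6]})
df_norm = normalize_numeric(df)
print(df_norm)
```

Here df keeps its original values and df_norm holds the normalized copy, which may be handy if you still need the raw data later.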
LateCoder answered Oct 04 '22 01:10