Ignore string columns while doing normalization

Tags:

python

python-3.x

pandas

normalization

I am using the following code to normalize a pandas DataFrame:

df_norm = (df - df.mean()) / (df.max() - df.min())

This works fine when all columns are numeric. However, df now has some string columns, and the normalization above raises errors. Is there a way to normalize only the numeric columns of a DataFrame while leaving the string columns unchanged?


asked Jun 19 '17 20:06

Edamame

1 Answer

You can use select_dtypes to pick out only the numeric columns:

import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': ['a', 'b', 'c'], 'c': [4, 5, 6]})

df

   a  b  c
0  1  a  4
1  2  b  5
2  3  c  6

df_num = df.select_dtypes(include='number')

df_num

   a  c
0  1  4
1  2  5
2  3  6
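If you only need the column names rather than a sub-frame, the same call can be used to split them into numeric and non-numeric groups. This is a small sketch; the extra `flag` column is a hypothetical addition, included only to show that boolean columns are not matched by `include='number'`.

```python
import pandas as pd

# 'flag' is a hypothetical column added for illustration only.
df = pd.DataFrame({
    'a': [1, 2, 3],
    'b': ['a', 'b', 'c'],
    'c': [4, 5, 6],
    'flag': [True, False, True],
})

# Names of the columns that would be normalized.
num_cols = df.select_dtypes(include='number').columns.tolist()
# Names of the columns that would be left untouched.
other_cols = df.select_dtypes(exclude='number').columns.tolist()

print(num_cols)    # ['a', 'c']
print(other_cols)  # ['b', 'flag']
```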

Then normalize those columns and assign the result back to the original df:

df_norm = (df_num - df_num.mean()) / (df_num.max() - df_num.min())


df[df_norm.columns] = df_norm

df

     a  b    c
0 -0.5  a -0.5
1  0.0  b  0.0
2  0.5  c  0.5
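Note that the assignment above overwrites df in place. If you would rather keep the original frame, the same steps can be wrapped in a function that works on a copy. This is only a sketch: `normalize_numeric` is a hypothetical helper name, not a pandas function, and replacing a zero range with 1 (so a constant column becomes all zeros instead of NaN) is an assumption that goes beyond the original formula.

```python
import pandas as pd


def normalize_numeric(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with numeric columns mean-normalized.

    Hypothetical helper; non-numeric columns are copied through unchanged.
    """
    out = df.copy()
    num = out.select_dtypes(include='number')
    # Range (max - min) of each numeric column.
    col_range = num.max() - num.min()
    # Assumption: avoid 0/0 for constant columns by dividing by 1 instead.
    col_range = col_range.replace(0, 1)
    out[num.columns] = (num - num.mean()) / col_range
    return out


df = pd.DataFrame({'a': [1, 2, 3], 'b': ['a', 'b', 'c'], 'c': [4, 5, 6]})
df_norm = normalize_numeric(df)
print(df_norm)
```

Here df keeps its original integer values, and df_norm holds the normalized copy with column b unchanged.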

answered Oct 04 '22 01:10

LateCoder
