How can I normalize the data in a range of columns in my pandas dataframe?

Tags: python, pandas

Suppose I have a pandas data frame surveyData:

I want to normalize the data in each column by performing:

surveyData_norm = (surveyData - surveyData.mean()) / (surveyData.max() - surveyData.min())

This would work fine if my data table only contained the columns I wanted to normalize. However, some of the preceding columns contain string data, like this:

Name  State  Gender  Age  Income  Height
Sam   CA     M       13   10000   70
Bob   AZ     M       21   25000   55
Tom   FL     M       30   100000  45

I only want to normalize the Age, Income, and Height columns, but my method above does not work because of the string data in the Name, State, and Gender columns.


asked Feb 18 '15 05:02

Jeremy

2 Answers

You can perform operations on a subset of rows or columns in pandas in a number of ways. One useful way is indexing:

# Assuming the same data as in your example
cols_to_norm = ['Age', 'Height']
# Min-max scaling: maps each selected column onto the range [0, 1]
surveyData[cols_to_norm] = surveyData[cols_to_norm].apply(lambda x: (x - x.min()) / (x.max() - x.min()))

This applies the function only to the columns you select and assigns the result back to those columns. Alternatively, you could assign the result to new columns and keep the originals.
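As a minimal sketch of keeping the originals, assuming the three-row table from the question and a hypothetical `_norm` suffix for the new columns, the mean-normalization formula from the question can be written to new columns like this:

```python
import pandas as pd

# Sample data copied from the question.
surveyData = pd.DataFrame({
    "Name": ["Sam", "Bob", "Tom"],
    "State": ["CA", "AZ", "FL"],
    "Gender": ["M", "M", "M"],
    "Age": [13, 21, 30],
    "Income": [10000, 25000, 100000],
    "Height": [70, 55, 45],
})

cols_to_norm = ["Age", "Income", "Height"]

# Formula from the question, applied per column: (x - mean) / (max - min).
subset = surveyData[cols_to_norm]
normalized = (subset - subset.mean()) / (subset.max() - subset.min())

# The "_norm" suffix is an illustrative choice, not a pandas convention.
surveyData = surveyData.join(normalized.add_suffix("_norm"))

print(surveyData[["Name", "Age", "Age_norm"]])
```

The string columns are never touched because the arithmetic only runs on the selected subset, and `join` aligns the new columns back onto the original rows by index.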

answered Sep 19 '22 09:09

cwharland

I think it's better to use sklearn.preprocessing in this case, since it offers many more scaling options. With StandardScaler, it would look like this in your case:

from sklearn.preprocessing import StandardScaler

cols_to_norm = ['Age', 'Height']
surveyData[cols_to_norm] = StandardScaler().fit_transform(surveyData[cols_to_norm])
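Note that StandardScaler rescales each column to zero mean and unit variance, so it divides by the standard deviation and not by the range used in the question. As a rough sketch of what that computes, assuming the sample table from the question, the same standardization can be written in plain pandas. The `ddof=0` argument matters here: StandardScaler uses the population standard deviation, while pandas defaults to the sample one.

```python
import pandas as pd

# Sample data copied from the question.
surveyData = pd.DataFrame({
    "Name": ["Sam", "Bob", "Tom"],
    "State": ["CA", "AZ", "FL"],
    "Gender": ["M", "M", "M"],
    "Age": [13, 21, 30],
    "Income": [10000, 25000, 100000],
    "Height": [70, 55, 45],
})

cols_to_norm = ["Age", "Height"]

# Standardization (z-score): (x - mean) / population standard deviation.
subset = surveyData[cols_to_norm].astype(float)
surveyData[cols_to_norm] = (subset - subset.mean()) / subset.std(ddof=0)

print(surveyData)
```

If the goal is range-based scaling closer to the question's formula, MinMaxScaler from the same module computes (x - min) / (max - min), which matches the indexing answer above but is still not mean-centered.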

answered Sep 21 '22 09:09

Yaron
