
Pandas: how to apply function to column names

Tags: python, pandas

I would like all columns to be named in a uniform manner, like:

Last Name -> LAST_NAME
e-mail -> E_MAIL
ZIP code 2 -> ZIP_CODE_2

To do that, I wrote a function that uppercases all letters, keeps digits, and replaces every other character with an underscore ('_'). It then collapses consecutive underscores into one and trims underscores from both ends.
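The question does not show the function itself. A minimal sketch of what such a function might look like follows; the name `normalize` is hypothetical, and treating only ASCII letters and digits as "kept" characters is an assumption:

```python
import re

def normalize(name):
    """Turn a column label into UPPER_SNAKE form (hypothetical helper)."""
    # Uppercase the letters; digits are left unchanged by upper().
    upper = name.upper()
    # Replace each run of characters other than A-Z and 0-9 with a single
    # underscore, which also collapses consecutive underscores.
    collapsed = re.sub(r'[^A-Z0-9]+', '_', upper)
    # Trim underscores left over at either end.
    return collapsed.strip('_')

print(normalize('Last Name'))   # LAST_NAME
print(normalize('e-mail'))      # E_MAIL
print(normalize('ZIP code 2'))  # ZIP_CODE_2
```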

How do I apply this function (lambda) to the column names in Pandas?

Denis Kulagin asked Mar 14 '17 14:03


2 Answers

You can do this without using apply by calling the vectorised str methods:

In [62]:
df = pd.DataFrame(columns=['Last Name','e-mail','ZIP code 2'])
df.columns

Out[62]:
Index(['Last Name', 'e-mail', 'ZIP code 2'], dtype='object')

In [63]:
df.columns = df.columns.str.upper().str.replace(' ','_')
df.columns

Out[63]:
Index(['LAST_NAME', 'E-MAIL', 'ZIP_CODE_2'], dtype='object')
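Note that the hyphen in 'e-mail' survives the call above, because only spaces are replaced. If every non-alphanumeric character should become an underscore, as the question asks, one possible variant uses a regular expression. This is a sketch, assuming a pandas version where `str.replace` accepts `regex=True` and that only ASCII letters and digits are to be kept:

```python
import pandas as pd

df = pd.DataFrame(columns=['Last Name', 'e-mail', 'ZIP code 2'])

df.columns = (
    df.columns.str.upper()
    # Each run of characters other than A-Z and 0-9 becomes one underscore.
    .str.replace(r'[^A-Z0-9]+', '_', regex=True)
    # Drop underscores left at the start or end of a name.
    .str.strip('_')
)

print(list(df.columns))  # ['LAST_NAME', 'E_MAIL', 'ZIP_CODE_2']
```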

Otherwise you can convert the Index object to a Series using to_series so you can use apply:

In [67]:
def func(x):
    return x.upper().replace(' ','_')
df.columns = df.columns.to_series().apply(func)
df

Out[67]:
Empty DataFrame
Columns: [LAST_NAME, E-MAIL, ZIP_CODE_2]
Index: []

Thanks to @PaulH for suggesting using rename with a lambda:

In [68]:
df.rename(columns=lambda c: c.upper().replace(' ','_'), inplace=True)
df.columns

Out[68]:
Index(['LAST_NAME', 'E-MAIL', 'ZIP_CODE_2'], dtype='object')

EdChum answered Oct 20 '22 00:10


You can simply assign to the .columns attribute of the data frame. To rename the columns, you can use:

df.columns = list(map(yourlambda,df.columns))

Here, replace yourlambda with your own lambda expression.
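As a complete illustration, here is a sketch in which `yourlambda` is filled in with a hypothetical expression matching the rules from the question (ASCII letters and digits kept, everything else turned into underscores):

```python
import re
import pandas as pd

# Hypothetical stand-in for the asker's lambda: uppercase the name, replace
# each run of non-alphanumeric characters with '_', then trim the ends.
yourlambda = lambda c: re.sub(r'[^A-Z0-9]+', '_', c.upper()).strip('_')

df = pd.DataFrame(columns=['Last Name', 'e-mail', 'ZIP code 2'])

# Assigning a list of the same length replaces all column labels at once.
df.columns = list(map(yourlambda, df.columns))

print(list(df.columns))  # ['LAST_NAME', 'E_MAIL', 'ZIP_CODE_2']
```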

Willem Van Onsem answered Oct 20 '22 00:10