Chaining string operations on Pandas Series

Tags: python, pandas

I recently found out about the str accessor for Pandas Series and it's great! However, if I want to chain operations (say, a couple of replace calls and a strip), I need to keep calling str after every operation, which doesn't make for the most elegant code.

For example, let's say my column names contain spaces and periods and I want to replace them with underscores. I might also want to strip any leading or trailing underscores. If I wanted to do this using str methods, is there any way to avoid having to run:

df.columns.str.replace(' ', '_').str.replace('.', '_').str.strip('_')

Thanks!

tomasn4a asked Nov 16 '17 15:11


2 Answers

I think you need to repeat .str for each string method; that is by design.

But here it is possible to use only one replace:

df = pd.DataFrame(columns=['aa dd', 'dd.d_', 'd._'])

print (df)
Empty DataFrame
Columns: [aa dd, dd.d_, d._]
Index: []

print(df.columns.str.replace(r'[\s.]', '_', regex=True).str.strip('_'))
Index(['aa_dd', 'dd_d', 'd'], dtype='object')
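
As a quick sanity check, here is a minimal sketch comparing the chained version from the question with a single regex replace. It assumes a pandas version where str.replace accepts the regex keyword, and it passes regex explicitly because the default has changed between pandas releases. The sample column names are the ones used above. Note that \s also matches tabs and newlines, not only spaces.

```python
import pandas as pd

cols = pd.Index(['aa dd', 'dd.d_', 'd._'])

# Chain from the question: three .str calls, each a literal replacement.
chained = (cols.str.replace(' ', '_', regex=False)
               .str.replace('.', '_', regex=False)
               .str.strip('_'))

# One regex replace: any whitespace character or a period becomes '_'.
single = cols.str.replace(r'[\s.]', '_', regex=True).str.strip('_')

print(list(chained))
print(list(single))
```

Both calls give the same names for this sample, so the regex version saves one .str call but not the one before strip.
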
jezrael answered Oct 16 '22 12:10


Why not use a list comprehension?

import re
df.columns = [re.sub(r'[\s.]', '_', x).strip('_') for x in df.columns]

In a list comp, you're working with the string object directly, without the need to call .str each time.
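
If you would rather keep things inside pandas, the same idea can be sketched with a plain function passed to rename or Index.map. The helper name clean_name is made up for illustration; it is not part of pandas.

```python
import re
import pandas as pd

def clean_name(name):
    """Hypothetical helper: turn whitespace and periods into
    underscores, then strip underscores from both ends."""
    return re.sub(r'[\s.]', '_', name).strip('_')

df = pd.DataFrame(columns=['aa dd', 'dd.d_', 'd._'])

# rename accepts a callable and applies it to every column label.
renamed = df.rename(columns=clean_name)

# Index.map does the same thing on the Index itself.
mapped = df.columns.map(clean_name)

print(list(renamed.columns))
print(list(mapped))
```

Inside clean_name you are working with ordinary Python strings, so the methods chain without .str, and rename returns a new DataFrame, which fits into a longer method chain.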

cs95 answered Oct 16 '22 13:10