
Pandas: Split string on last occurrence

Tags: python, pandas

I'm trying to split a column in a pandas dataframe based on a separator character, and obtain the last section.

pandas has the str.rsplit and the str.rpartition functions.

If I try:

df_client["Subject"].str.rsplit("-", n=1)

I get

0 [Activity -Location , UserCode]
1 [Activity -Location , UserCode]

and if I try

df_client["Subject"].str.rpartition("-")

I get

                    0  1          2
0  Activity -Location  -   UserCode
1  Activity -Location  -   UserCode

If I do

df_client["Subject"].str.rpartition("-")[2]

I get

0 UserCode

which is what I want.

To me, str.rsplit seems unintuitive.

After getting the list of the split string, how would I then select the single item that I need?

Alan asked Sep 02 '18 16:09




1 Answer

I think you need indexing with .str, which also works on iterables such as the lists returned by str.rsplit:

# select the last item of each list
df_client["Subject"].str.rsplit("-", n=1).str[-1]
# select the second item of each list
df_client["Subject"].str.rsplit("-", n=1).str[1]
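
As a rough, self-contained illustration of the indexing described above, the sketch below builds a small DataFrame and compares the rsplit route with the rpartition route from the question. The sample values are hypothetical (the original data is not shown in full), and `n=1` is passed as a keyword because recent pandas versions only accept `pat` positionally.

```python
import pandas as pd

# Hypothetical sample data modelled on the question's "Subject" column.
df_client = pd.DataFrame(
    {"Subject": ["Activity -Location - UserCode", "Task - Site - AB12"]}
)

# Split once from the right, then take the last element of each resulting list.
last_via_rsplit = df_client["Subject"].str.rsplit("-", n=1).str[-1]

# rpartition returns three columns (before, separator, after); column 2 is the tail.
last_via_rpartition = df_client["Subject"].str.rpartition("-")[2]

# Both results keep the space that followed the separator, so strip it if unwanted.
print(last_via_rsplit.str.strip().tolist())  # ['UserCode', 'AB12']
```

Both routes should give the same values here; which one reads better is largely a matter of taste.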

If performance is important, use a list comprehension:

df_client['last_col'] = [x.rsplit("-", 1)[-1] for x in df_client["Subject"]]
print(df_client)
                      Subject  last_col
0  Activity-Location-UserCode  UserCode
1  Activity-Location-UserCode  UserCode
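
Whether the list comprehension is actually faster depends on the data and the pandas version, so it may be worth measuring. The following is only a sketch: the data is hypothetical (the sample value repeated), the helper names `with_str_accessor` and `with_list_comprehension` are made up for illustration, and no particular timings are claimed.

```python
import timeit

import pandas as pd

# Hypothetical data: the question's sample value repeated many times.
df_client = pd.DataFrame({"Subject": ["Activity-Location-UserCode"] * 10_000})


def with_str_accessor():
    # Vectorised-looking route through the .str accessor.
    return df_client["Subject"].str.rsplit("-", n=1).str[-1]


def with_list_comprehension():
    # Assumes every value is a string; NaN or None would raise AttributeError here.
    return [x.rsplit("-", 1)[-1] for x in df_client["Subject"]]


# Both approaches should agree before their speed is compared.
assert with_str_accessor().tolist() == with_list_comprehension()

t_accessor = timeit.timeit(with_str_accessor, number=20)
t_listcomp = timeit.timeit(with_list_comprehension, number=20)
print(f".str accessor: {t_accessor:.3f}s, list comprehension: {t_listcomp:.3f}s")
```

Note that the list comprehension does not skip missing values the way the .str accessor does, so it is only a drop-in replacement when the column contains strings throughout.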
jezrael answered Sep 21 '22 19:09