
How to return match with a string that contains parentheses in pandas?

Tags:

python

pandas

I have part of my code extracting an element from a column Ranks by matching a string name with elements in another column Names:

rank = df.loc[df['Names'].str.contains(name), 'Ranks'].iloc[0]

The code works as intended except in a few cases where name contains parentheses.

For example, it raises an error for name = 'Banana (1998)'.

I understand that str.contains might not be the best method here, but I have looked around and could not find another post about the same problem to work from.

A sample of the df can be reproduced with:

import pandas as pd

data = [['Apple', 10], ['Banana (1998)', 15], ['Banana (2000)', 14]]
df = pd.DataFrame(data, columns=['Names', 'Ranks'])

Sd Junk asked Mar 03 '23 07:03



1 Answer

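If you use str.contains, you need to escape '(' and ')' in name, because str.contains treats its pattern as a regular expression by default and parentheses are special characters in regex:

# Raw string, so the backslashes reach the regex engine unchanged
name = r'Banana \(1998\)'
df['Names'].str.contains(name)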

Out[655]:
0    False
1     True
2    False
Name: Names, dtype: bool

df.loc[df['Names'].str.contains(name), 'Ranks']

Out[659]:
1    15
Name: Ranks, dtype: int64
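
When name comes from data rather than being typed by hand, escaping it manually is not practical. As a minimal sketch (not part of the original answer), two alternatives are to escape the string with re.escape from the standard library, or to pass regex=False so that str.contains does a plain substring match. The variable names mask_escaped, mask_literal, rank_escaped and rank_literal are illustrative only.

```python
import re

import pandas as pd

data = [['Apple', 10], ['Banana (1998)', 15], ['Banana (2000)', 14]]
df = pd.DataFrame(data, columns=['Names', 'Ranks'])

name = 'Banana (1998)'  # unescaped, as it would arrive from data

# Option 1: escape regex metacharacters so the pattern matches literally
mask_escaped = df['Names'].str.contains(re.escape(name))

# Option 2: skip regex entirely and do a plain substring match
mask_literal = df['Names'].str.contains(name, regex=False)

rank_escaped = df.loc[mask_escaped, 'Ranks'].iloc[0]
rank_literal = df.loc[mask_literal, 'Ranks'].iloc[0]
print(rank_escaped, rank_literal)
```

Both options select only the 'Banana (1998)' row here. If a full match on the name is what is actually wanted, a plain comparison such as df['Names'] == name avoids substring matching altogether.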
Andy L. answered Apr 08 '23 13:04
