
Pandas: select rows if keyword appears in any column

I know there is a related thread about searching for a string in one column, but how does one use pd.Series.str.contains(pattern) across all columns?

import pandas as pd

df = pd.DataFrame({'vals': [1, 2, 3, 4], 'ids': [u'aball', u'bball', u'cnut', u'fball'],
                   'id2': [u'uball', u'mball', u'pnut', u'zball']})


In [3]: df[df['ids'].str.contains("ball")]
Out[3]:
   vals    ids    id2
0     1  aball  uball
1     2  bball  mball
3     4  fball  zball
Cibic asked Jun 26 '18 14:06


1 Answer

stack

Select only the columns that can contain 'ball', that is, the columns of dtype object, and stack the resulting dataframe into a single Series. At that point you can apply pandas.Series.str.contains to every value at once and unstack the result back into a boolean dataframe with the original rows and columns.

df.select_dtypes(include=[object]).stack().str.contains('ball').unstack()

     ids    id2
0   True   True
1   True   True
2  False  False
3   True   True
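
The result above is a boolean mask, not yet the selected rows. One way to get the rows the question asks for is to reduce the mask with any(axis=1) and index the original dataframe with it. This is a minimal sketch, not part of the original answer: the names mask and rows are illustrative, and it uses select_dtypes(exclude=['number']) in place of include=[object], on the assumption that every non-numeric column holds strings.

```python
import pandas as pd

df = pd.DataFrame({
    'vals': [1, 2, 3, 4],
    'ids': ['aball', 'bball', 'cnut', 'fball'],
    'id2': ['uball', 'mball', 'pnut', 'zball'],
})

# Assumption: all non-numeric columns contain strings, so dropping the
# numeric ones leaves exactly the columns worth searching.
text = df.select_dtypes(exclude=['number'])

# Stack into one Series, test every value, then restore the 2-D shape.
mask = text.stack().str.contains('ball').unstack()

# Keep each row where at least one text column matched.
rows = df[mask.any(axis=1)]
print(rows)
```

For this frame the selection keeps the rows with index 0, 1 and 3. Replacing any(axis=1) with all(axis=1) would instead require the keyword in every text column.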
piRSquared answered Nov 05 '22 16:11