
How to search a partial String in the whole dataframe using pandas?

Tags:

pandas

How can I search for a string value in every column of a pandas DataFrame? Let's say I have 32 columns.

df[df['A'].str.contains("hello")]

This only checks whether the value is present in column "A". How can I search every column and return the rows in which the value occurs? Dataset:

A           B           C
1           hi          hie
2           bye         Hello

If I search for "hello" or "Hello", the output should be:

A           B            C
2           bye         Hello
Sidhartha asked May 29 '17 07:05



2 Answers

I think you can use:

import pandas as pd

df = pd.DataFrame({'A':['hello fgf','s','f'],'B':['d','ff hello','f'],'C':[4,7,8]})
print (df)
           A         B  C
0  hello fgf         d  4
1          s  ff hello  7
2          f         f  8

mask = df.applymap(lambda x: 'hello' in str(x))
print (mask)
       A      B      C
0   True  False  False
1  False   True  False
2  False  False  False

Then, to filter the rows, use any(axis=1) to check for at least one True per row and apply boolean indexing:

df1 = df[mask.any(axis=1)]
print (df1)
           A         B  C
0  hello fgf         d  4
1          s  ff hello  7
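
On recent pandas releases (2.1 and later) applymap is deprecated in favour of DataFrame.map, so the same mask can be built in a version-tolerant way. This is only a sketch; the name elementwise is made up for illustration:

```python
import pandas as pd

df = pd.DataFrame({'A': ['hello fgf', 's', 'f'],
                   'B': ['d', 'ff hello', 'f'],
                   'C': [4, 7, 8]})

# DataFrame.map was added in pandas 2.1; older versions only have applymap.
# `elementwise` is a hypothetical helper name, not part of pandas.
elementwise = df.map if hasattr(pd.DataFrame, 'map') else df.applymap

# True wherever the cell, rendered as text, contains the substring
mask = elementwise(lambda x: 'hello' in str(x))

# keep rows with at least one matching cell
df1 = df[mask.any(axis=1)]
print(df1)
```

Either branch should give the same two rows as the filter above.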

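EDIT: For a case-insensitive search, lowercase both the search term and each cell. The mask printed below corresponds to the two-row sample from the question, not to the df defined above: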

tested = 'hello'
mask = df.applymap(lambda x:  tested.lower() in str(x).lower())
print (mask)
       A      B      C
0  False  False  False
1  False  False   True
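
If this is needed in more than one place, the case-insensitive mask and the row filter could be wrapped in one small function. A minimal sketch, assuming the sample data from the question; rows_containing and its ignore_case parameter are hypothetical names, not pandas API:

```python
import pandas as pd

def rows_containing(df, needle, ignore_case=True):
    """Return the rows of df where any cell, rendered as text, contains needle."""
    text = df.astype(str)  # note: missing values become the text 'nan'
    if ignore_case:
        text = text.apply(lambda col: col.str.lower())
        needle = needle.lower()
    # regex=False treats needle as a literal substring, not a pattern
    mask = text.apply(lambda col: col.str.contains(needle, regex=False))
    return df[mask.any(axis=1)]

# sample data from the question
sample = pd.DataFrame({'A': [1, 2], 'B': ['hi', 'bye'], 'C': ['hie', 'Hello']})
result = rows_containing(sample, 'hello')
print(result)
```

This works column by column with the vectorized .str methods instead of calling a Python lambda on every cell, which may be faster on wide frames, though I have not benchmarked it.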
jezrael answered Sep 18 '22 14:09


You can also concatenate all columns into one string and search for your substring in the concatenated string:

In [21]: df[df.astype(str).add('|').sum(1).str.contains('hello')]
Out[21]:
           A         B  C
0  hello fgf         d  4
1          s  ff hello  7

Explanation:

In [22]: df.astype(str).add('|').sum(1)
Out[22]:
0    hello fgf|d|4|
1     s|ff hello|7|
2            f|f|8|
dtype: object
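
A variation on the same idea, in case it helps: joining each row with agg and '|'.join avoids the trailing separator, and passing regex=False and case=False to str.contains makes the search literal and case-insensitive (by default the pattern is treated as a regular expression). This is a sketch on the same df, not a drop-in replacement; the names joined and hits are illustrative:

```python
import pandas as pd

df = pd.DataFrame({'A': ['hello fgf', 's', 'f'],
                   'B': ['d', 'ff hello', 'f'],
                   'C': [4, 7, 8]})

# one string per row, cells separated by '|'
joined = df.astype(str).agg('|'.join, axis=1)

# literal, case-insensitive substring search on the joined text
hits = df[joined.str.contains('HELLO', case=False, regex=False)]
print(hits)
```

One caveat with any concatenation approach: a search term that itself contains the separator could match across two neighbouring cells, so pick a separator that cannot occur in the term.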
MaxU - stop WAR against UA answered Sep 19 '22 14:09