
USING LIKE inside pandas.query()

I have been using pandas for more than 3 months and have a fair idea of how to access and query dataframes.

I have a requirement to query a dataframe using the LIKE keyword (as in SQL) inside pandas.query().

For example, I am trying to execute pandas.query("column_name LIKE 'abc%'"), but it fails.

I know an alternative approach, which is to use str.contains("abc%"), but this doesn't meet our requirement.

We want to use LIKE inside pandas.query(). How can I do so?

Pradeep M asked Jul 13 '15 18:07



2 Answers

If you have to use df.query(), the correct syntax is:

df.query('column_name.str.contains("abc")', engine='python') 

You can easily combine this with other conditions:

df.query('column_a.str.contains("abc") or column_b.str.contains("xyz") and column_c>100', engine='python') 
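A note on precedence in the combined condition above: `and` binds more tightly than `or`, so that expression reads as "column_a matches, or (column_b matches and column_c > 100)". The following is a minimal sketch on made-up sample data (the column values are assumptions for illustration only) showing how explicit parentheses change the result:

```python
import pandas as pd

# Hypothetical sample data, invented for this illustration.
df = pd.DataFrame({
    "column_a": ["abc1", "zzz", "zzz", "abcd"],
    "column_b": ["none", "xyz", "xyz", "q"],
    "column_c": [50, 150, 50, 200],
})

# Without parentheses: a OR (b AND c > 100)
default = df.query(
    'column_a.str.contains("abc") or column_b.str.contains("xyz") and column_c > 100',
    engine="python",
)

# With parentheses: (a OR b) AND c > 100
grouped = df.query(
    '(column_a.str.contains("abc") or column_b.str.contains("xyz")) and column_c > 100',
    engine="python",
)

default_rows = default.index.tolist()
grouped_rows = grouped.index.tolist()
```

Row 0 matches on column_a alone, so it appears in the first result but is dropped from the second because its column_c is not above 100.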

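Note that str.contains is not a full equivalent of SQL LIKE: it matches the substring anywhere in the value (closer to LIKE '%abc%') and treats the pattern as a regular expression by default. It can be useful nevertheless.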
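For the simple `%` wildcard cases, the other pandas string methods get closer to what LIKE does. This is only a sketch on made-up data (the sample values are assumptions), not a complete LIKE implementation; the `_` single-character wildcard is not covered here:

```python
import pandas as pd

# Hypothetical sample data, invented for this illustration.
df = pd.DataFrame({"column_name": ["abcdef", "xabc", "xyz", "abc"]})

# LIKE 'abc%'  -> value starts with "abc"
starts = df.query('column_name.str.startswith("abc")', engine="python")

# LIKE '%abc'  -> value ends with "abc"
ends = df.query('column_name.str.endswith("abc")', engine="python")

# LIKE '%abc%' -> "abc" appears anywhere in the value
anywhere = df.query('column_name.str.contains("abc")', engine="python")

starts_values = starts["column_name"].tolist()
ends_values = ends["column_name"].tolist()
anywhere_values = anywhere["column_name"].tolist()
```

Unlike str.contains, str.startswith and str.endswith compare literal text, so characters such as `.` or `*` in the search string are not interpreted as regex syntax.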

volodymyr answered Oct 01 '22 17:10



@volodymyr is right, but note that you need to set engine='python' for the expression to work.

Example:

>>> pd_df.query('column_name.str.contains("abc")', engine='python') 

See the pandas documentation for more information on the default engine ('numexpr') and the 'python' engine. Also, keep in mind that the 'python' engine is slower on large data.

P.Panayotov answered Oct 01 '22 16:10
