
Pandas - equivalent of str.contains() in pandas query

I am creating a DataFrame by subsetting with the conditions below:

subset_df = df_eq.loc[(df_eq['place'].str.contains('Chile')) & (df_eq['mag'] > 7.5),['time','latitude','longitude','mag','place']]

I want to replicate the above subset using query() in pandas. However, I am not sure how to express the equivalent of str.contains() in a pandas query. Using "like" in the query doesn't seem to work:

query_df = df_eq[['time','latitude','longitude','mag','place']].query('place like \'%Chile\' and mag > 7.5')

place like '%Chile'and mag >7.5 
            ^
SyntaxError: invalid syntax

Any help will be appreciated.

asked Jul 29 '16 15:07 by raul



3 Answers

As of now, I am able to do this by passing engine='python' to the .query method, which allows str.contains to be used inside the query expression.

This should work:

query_df = df_eq[['time', 'latitude', 'longitude', 'mag', 'place']].query(
    "place.str.contains('Chile') and mag > 7.5", engine="python")
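
For anyone who wants to try this without the original data, here is a minimal self-contained sketch. The rows in df_eq below are made-up illustrative values, not the asker's dataset; only the column names come from the question. It checks that the query() form selects the same rows as the .loc version.

```python
import pandas as pd

# Made-up sample rows standing in for the asker's df_eq (values are illustrative only)
df_eq = pd.DataFrame({
    "time": ["2001-01-01", "2002-02-02", "2003-03-03", "2004-04-04"],
    "latitude": [-36.1, -19.6, 38.3, -31.6],
    "longitude": [-72.9, -70.8, 142.4, -71.7],
    "mag": [8.8, 8.2, 9.1, 7.0],
    "place": ["Region A, Chile", "Region B, Chile", "Region C, Japan", "Region D, Chile"],
})

cols = ["time", "latitude", "longitude", "mag", "place"]

# query() version, using the Python engine so that .str.contains is evaluated as a method call
query_df = df_eq[cols].query(
    "place.str.contains('Chile') and mag > 7.5", engine="python")

# Original boolean-indexing version from the question, for comparison
subset_df = df_eq.loc[
    df_eq["place"].str.contains("Chile") & (df_eq["mag"] > 7.5), cols]

print(query_df.equals(subset_df))
```

Note that if the place column contains missing values, str.contains returns NaN for those rows; passing na=False to str.contains is one way to treat them as non-matches.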
answered Oct 15 '22 01:10 by petobens


I think the problem is that you cannot call str.contains directly inside the pandas query method. What you can do instead is create a boolean mask outside the query and refer to that mask from within query using the at sign (@). Try this:

my_mask = df_eq["feature"].str.contains('my_word')
df_eq.query("@my_mask")
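
Here is a small runnable sketch of the same idea applied to the question's filter. The sample rows and the name chile_mask are assumptions for illustration; the mask is built outside query() and then combined with the mag condition inside it.

```python
import pandas as pd

# Made-up sample rows (illustrative only); one place value is missing on purpose
df_eq = pd.DataFrame({
    "place": ["Coast of Chile", "Central Chile", None, "Honshu, Japan"],
    "mag": [8.8, 6.9, 8.0, 9.1],
})

# Build the boolean mask with ordinary pandas string methods.
# na=False makes rows with a missing place count as non-matches.
chile_mask = df_eq["place"].str.contains("Chile", na=False)

# Reference the mask variable from the surrounding scope with @
result = df_eq.query("@chile_mask and mag > 7.5")

print(result)
```

Because the string matching happens before query() is called, this does not depend on which engine query() uses.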
answered Oct 15 '22 03:10 by Gustavo Vera Velasco


Using str.contains works for me in pandas 1.0.0 with this syntax:

df.query("columnA == 'foo' and columnB.str.contains('bar')")
answered Oct 15 '22 03:10 by eddygeek