I have a DataFrame. One column (name) has string values. I was wondering if there is a way to select rows based on a partial string match against a particular column, using the DataFrame.query() method.
I tried:
df.query('name.str.contains("lu")'). Error message: "TypeError: 'Series' objects are mutable, thus they cannot be hashed"
df.query('"lu" in name'). Returns an empty DataFrame.
The code I use:
import pandas as pd

df = pd.DataFrame({
    'name': ['blue', 'red', 'blue'],
    'X1': [96.32, 96.01, 96.05]
}, columns=['name', 'X1'])

# Attempt 1: returns an empty DataFrame
print(df.query('"lu" in name').head())
# Attempt 2: raises the TypeError quoted above
print(df.query('name.str.contains("lu")').head())
I know I could use df[df['name'].str.contains("lu")], but I prefer to use query.
Another option is to iterate over the rows with iterrows() and use find to pick out the rows that contain the desired text. iterrows() returns an iterator that yields each index value along with a Series containing the data in that row.
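A minimal sketch of the iterrows() approach, assuming the same sample DataFrame as in the question. The names matching_labels and iterrows_result are only illustrative, and this loop is usually slower than a vectorized string method on larger frames.

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['blue', 'red', 'blue'],
    'X1': [96.32, 96.01, 96.05]
}, columns=['name', 'X1'])

# Collect the index labels of rows whose 'name' contains the substring.
# str.find returns -1 when the substring is absent.
matching_labels = [
    idx for idx, row in df.iterrows()
    if row['name'].find('lu') != -1
]

# Select the matching rows by label.
iterrows_result = df.loc[matching_labels]
print(iterrows_result)
```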
Slicing Rows and Columns by Index Position
When slicing by index position in Pandas, the start index is included in the output, but the stop index is excluded, so it must be one step beyond the last row you want to select. A row slice of 0:2 therefore returns row 0 and row 1, but does not return row 2. The second slice, :, indicates that all columns are required.
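The slice described above is not shown in the text, so the following is an assumed reconstruction using iloc on the sample DataFrame from the question. The variable name first_two is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['blue', 'red', 'blue'],
    'X1': [96.32, 96.01, 96.05]
}, columns=['name', 'X1'])

# Rows at positions 0 and 1 (the stop position 2 is excluded), all columns.
first_two = df.iloc[0:2, :]
print(first_two)
```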
Using “contains” to Find a Substring in a Pandas DataFrame
The str.contains method in Pandas lets you search a column for a specific substring. It returns a boolean Series that is True where the original value contains the substring and False where it does not.
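As a small illustration on the sample data from the question, the boolean Series can be used directly as a row mask. The names mask and contains_result are only illustrative. Note that str.contains treats the pattern as a regular expression by default; passing regex=False asks for a literal match.

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['blue', 'red', 'blue'],
    'X1': [96.32, 96.01, 96.05]
}, columns=['name', 'X1'])

# Boolean Series: True where 'name' contains the literal substring "lu".
mask = df['name'].str.contains('lu', regex=False)
print(mask.tolist())

# Keep only the rows where the mask is True.
contains_result = df[mask]
print(contains_result)
```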
The issue that @ayhan refers to now shows how this can be achieved by using query's python engine:
print(df.query('name.str.contains("lu")', engine='python').head())
should work.
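For completeness, here is the suggestion as a self-contained script on the DataFrame from the question. The variable name query_result is only illustrative. Whether the engine argument is strictly required may depend on the pandas version and on whether numexpr is installed, so passing engine='python' explicitly is the safer choice.

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['blue', 'red', 'blue'],
    'X1': [96.32, 96.01, 96.05]
}, columns=['name', 'X1'])

# The python engine evaluates the expression with regular pandas semantics,
# so the .str accessor is available inside the query string.
query_result = df.query('name.str.contains("lu")', engine='python')
print(query_result.head())
```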