I am a newbie to pandas so please forgive the newbie question!
I have the following code:
import pandas as pd
pet_names = ["Name","Species"
"Jack","Cat"
"Jill","Dog"
"Tom","Cat"
"Harry","Dog"
"Hannah","Dog"]
df = pd.DataFrame(pet_names)
df = df[df['Species']!='Cat']
print(df)
I would like to remove all the rows that contain "Cat" in the "Species" column, leaving all the dogs behind. How do I do this? Unfortunately, this code is currently returning errors.
boolean indexing
df[df['Species'] != 'Cat']
# df[df['Species'].ne('Cat')]
Index Name Species
1 1 Jill Dog
3 3 Harry Dog
4 4 Hannah Dog
df.query
df.query("Species != 'Cat'")
Index Name Species
1 1 Jill Dog
3 3 Harry Dog
4 4 Hannah Dog
For information on the pd.eval() family of functions, their features and use cases, please visit Dynamic Expression Evaluation in pandas using pd.eval().
df.isin
df[~df['Species'].isin(['Cat'])]
Index Name Species
1 1 Jill Dog
3 3 Harry Dog
4 4 Hannah Dog
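As a quick check that the three approaches above are interchangeable for this case, here is a minimal sketch. It assumes the frame has just the "Name" and "Species" columns from the question; the variable names are only illustrative.

```python
import pandas as pd

# Assumed sample data, rebuilt from the names in the question.
df = pd.DataFrame({
    "Name": ["Jack", "Jill", "Tom", "Harry", "Hannah"],
    "Species": ["Cat", "Dog", "Cat", "Dog", "Dog"],
})

by_mask = df[df["Species"] != "Cat"]          # boolean indexing
by_query = df.query("Species != 'Cat'")       # query string
by_isin = df[~df["Species"].isin(["Cat"])]    # negated membership test

# All three keep the same rows and the same original index labels.
print(by_mask.equals(by_query) and by_mask.equals(by_isin))
```

The isin form becomes the more convenient one when several values need to be excluded at once, since the list can simply hold more entries.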
Your code df[df['Species']!='Cat'] is correct. It's your dataframe initialization code that is wrong. See the other comment by user cs95.
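To illustrate the initialization problem: in the question, the list is flat and the line ends have no commas, so Python joins adjacent string literals (for example "Species" and "Jack" become "SpeciesJack") and pandas builds a single unnamed column, which is why df['Species'] fails. One possible fix, sketched below, passes the rows as a list of lists and names the columns explicitly. The variable names rows and dogs are just for illustration.

```python
import pandas as pd

# Each inner list is one row; the header goes in `columns`, not in the data.
rows = [
    ["Jack", "Cat"],
    ["Jill", "Dog"],
    ["Tom", "Cat"],
    ["Harry", "Dog"],
    ["Hannah", "Dog"],
]
df = pd.DataFrame(rows, columns=["Name", "Species"])

# The filter from the question now works unchanged.
dogs = df[df["Species"] != "Cat"]
print(dogs)
```

The filtered frame keeps the original index labels (1, 3, 4); call reset_index(drop=True) on it if you want them renumbered from 0.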
While the other answer is correct, I prefer to use drop() when deleting rows because it is more straightforward than using inverse logic (keeping rows that are not Cats). There's no difference for a simple example like this, but if you start having more complex logic for which rows to drop, then it matters. For example, delete rows where A=1 AND (B=2 OR C=3).
Here's how you use drop() with conditional logic:
df.drop( df.query(" `Species`=='Cat' ").index)
This is a more scalable syntax for more complicated logic.
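For the more complex case mentioned above (drop rows where A=1 AND (B=2 OR C=3)), a rough sketch could look like this. The columns A, B and C and their values are made up purely for illustration; they are not from the question's data.

```python
import pandas as pd

# Hypothetical data with the columns A, B and C from the example condition.
df = pd.DataFrame({
    "A": [1, 1, 1, 2, 2],
    "B": [2, 0, 0, 2, 0],
    "C": [0, 3, 0, 3, 0],
})

# State the rows to delete directly, then drop them by index label.
to_drop = df.query("A == 1 and (B == 2 or C == 3)").index
result = df.drop(to_drop)
print(result)
```

The condition reads the same way as the rule for what to delete, so there is no need to negate it by hand as you would with a keep-mask. Note that drop() works on index labels, so this assumes the index values are unique.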