I have the below dataframe
In [62]: df
Out[62]:
coverage name reports year
Cochice 45 Jason 4 2012
Pima 214 Molly 24 2012
Santa Cruz 212 Tina 31 2013
Maricopa 72 Jake 2 2014
Yuma 85 Amy 3 2014
Basically i can filter the rows as below
df[df["coverage"] > 30
and i can drop/delete a single row as below
df.drop(['Cochice', 'Pima'])
But i want to delete a certain number of rows based on a condition, how can i do so?
The best is boolean indexing
but need invert condition - get all values equal and higher as 72
:
print (df[df["coverage"] >= 72])
coverage name reports year
Pima 214 Molly 24 2012
Santa Cruz 212 Tina 31 2013
Maricopa 72 Jake 2 2014
Yuma 85 Amy 3 2014
It is same as ge
function:
print (df[df["coverage"].ge(72)])
coverage name reports year
Pima 214 Molly 24 2012
Santa Cruz 212 Tina 31 2013
Maricopa 72 Jake 2 2014
Yuma 85 Amy 3 2014
Another possible solution is invert mask by ~
:
print (df["coverage"] < 72)
Cochice True
Pima False
Santa Cruz False
Maricopa False
Yuma False
Name: coverage, dtype: bool
print (~(df["coverage"] < 72))
Cochice False
Pima True
Santa Cruz True
Maricopa True
Yuma True
Name: coverage, dtype: bool
print (df[~(df["coverage"] < 72)])
coverage name reports year
Pima 214 Molly 24 2012
Santa Cruz 212 Tina 31 2013
Maricopa 72 Jake 2 2014
Yuma 85 Amy 3 2014
we can use pandas.query() functionality as well
import pandas as pd
dict_ = {'coverage':[45,214,212,72,85], 'name': ['jason','Molly','Tina','Jake','Amy']}
df = pd.DataFrame(dict_)
print(df.query('coverage > 72'))
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With