I have this large dataframe I've imported into pandas and I want to chop it down via a filter. Here is my basic sample code:
import pandas as pd
import numpy as np
from pandas import Series, DataFrame
df = DataFrame({'A':[12345,0,3005,0,0,16455,16454,10694,3005],'B':[0,0,0,1,2,4,3,5,6]})
df2= df[df["A"].map(lambda x: x > 0) & (df["B"] > 0)]
This displays the bottom 4 rows, which is semi-correct, but I need to display everything BUT those rows. So essentially, I'm looking for a way to use this filter in a "not" version, if that's possible: if column A is greater than 0 AND column B is greater than 0, then that row should be excluded from the dataframe. Thanks
Method 1: Use a NOT IN filter on one column. isin() tests whether each value in a column appears in a given list; negating that result with ~ keeps only the rows whose values are not in the list.
You can use df[df["Courses"] == 'Spark'] to filter rows by a condition in a pandas DataFrame. Note that this expression returns a new DataFrame containing the selected rows.
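As a minimal sketch of the isin()-based "not in" filter described above, applied to the question's dataframe: the list name excluded and its values are made up for illustration, not taken from the original posts.

```python
import pandas as pd

df = pd.DataFrame({
    "A": [12345, 0, 3005, 0, 0, 16455, 16454, 10694, 3005],
    "B": [0, 0, 0, 1, 2, 4, 3, 5, 6],
})

# Hypothetical list of values to filter on (an assumption for this example).
excluded = [0, 3005]

# Rows whose "A" value IS in the list.
in_list = df[df["A"].isin(excluded)]

# Rows whose "A" value is NOT in the list: negate the boolean mask with ~.
not_in_list = df[~df["A"].isin(excluded)]

print(not_in_list)
```

Both expressions return new DataFrames and leave df unchanged.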
There is no need for the map call on Series "A"; df["A"] > 0 already compares element-wise.
Apply De Morgan's Law:
"not (A and B)" is the same as "(not A) or (not B)"
df2 = df[~(df.A > 0) | ~(df.B > 0)]
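To check the De Morgan rewrite on the question's data, here is a small sketch that builds both masks and confirms they select the same rows. The variable names (both_positive, negated, de_morgan) are chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "A": [12345, 0, 3005, 0, 0, 16455, 16454, 10694, 3005],
    "B": [0, 0, 0, 1, 2, 4, 3, 5, 6],
})

# The original filter: rows where both columns are positive.
both_positive = (df["A"] > 0) & (df["B"] > 0)

# "not (A and B)": negate the whole mask.
negated = ~both_positive

# "(not A) or (not B)": negate each condition and combine with |.
de_morgan = ~(df["A"] > 0) | ~(df["B"] > 0)

# The two masks are identical, so either one can be used for indexing.
assert negated.equals(de_morgan)

df2 = df[negated]
print(df2)
```

Negating the whole mask with a single ~ is often the easiest form to read, since the original condition stays visible inside the parentheses.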
There is no need for the map call. You can just invert the conditions, like this (.ix was removed in pandas 1.0, so use .loc instead):
df.loc[(df.A<=0)|(df.B<=0),:]
Or use boolean indexing without .loc:
df[(df.A<=0)|(df.B<=0)]
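The sketch below, assuming a current pandas where .loc is the label-based indexer, shows that the .loc form and plain boolean indexing return the same rows. It also illustrates one caveat worth knowing: inverting the comparisons (<=) and negating the mask (~) can differ when a column contains NaN. The with_nan frame is a made-up example, not data from the question.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "A": [12345, 0, 3005, 0, 0, 16455, 16454, 10694, 3005],
    "B": [0, 0, 0, 1, 2, 4, 3, 5, 6],
})

# Inverted conditions: keep rows where either column is not positive.
mask = (df["A"] <= 0) | (df["B"] <= 0)

via_loc = df.loc[mask, :]   # label-based indexer with a boolean mask
via_brackets = df[mask]     # plain boolean indexing
assert via_loc.equals(via_brackets)

# Caveat: every comparison with NaN is False, so the two styles disagree
# on rows that contain missing values (hypothetical data below).
with_nan = pd.DataFrame({"A": [np.nan, 5.0], "B": [1.0, 2.0]})

kept_by_le = with_nan[(with_nan["A"] <= 0) | (with_nan["B"] <= 0)]
kept_by_not = with_nan[~((with_nan["A"] > 0) & (with_nan["B"] > 0))]

print(len(kept_by_le), len(kept_by_not))
```

On the question's data there are no missing values, so both styles give the same five rows; the choice only matters if NaN can appear.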