I'm trying to filter rows in my dataframe so that only two combinations of values are allowed for two columns. For example, columns 'A' and 'B' must satisfy either 'A' > 0 and 'B' > 0, or 'A' < 0 and 'B' < 0. I want to filter out any other combination.
I tried the following:
df = df.loc[(df['A'] > 0 & df['B'] > 0) or (df['A'] < 0 & df['B'] < 0)]
which gives me an error: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
I know this is probably a very trivial question, but I couldn't find a solution and I can't figure out what the problem with my approach is.
Use pandas DataFrame.iloc[] and DataFrame.loc[] to select rows by integer position and by index label respectively. iloc[] accepts a single position, a list of positions, a slice, and more.
loc selects rows based on a labeled index. So, if you want to select the row with an index label of 5, you would use df.loc[[5]].
You can select rows whose index labels appear in a list using the Index.isin() method, which checks whether each index label is contained in the given values.
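As a minimal sketch of the isin() approach: the sample data, the string index labels, and the name wanted below are made up for illustration and are not from the question.

```python
import pandas as pd

# Hypothetical data with string index labels.
df = pd.DataFrame({"A": [10, 20, 30, 40]}, index=["w", "x", "y", "z"])

# Labels to keep; "q" is deliberately absent from the index.
wanted = ["x", "z", "q"]

# Index.isin() returns a boolean array with one entry per row;
# labels that are not present are simply ignored.
selected = df[df.index.isin(wanted)]
print(selected)
```

This form tolerates labels that do not exist in the index, whereas in recent pandas versions df.loc[wanted] is expected to raise a KeyError for a missing label.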
In a pandas DataFrame you can retrieve a specific row with iloc[]. It is an indexer rather than a function, so the row position goes in square brackets, for example df.iloc[0] for the first row.
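The difference between label-based and position-based selection is easiest to see when the index does not start at 0. The following is only a sketch with made-up data:

```python
import pandas as pd

# Hypothetical data whose index labels (5, 6, 7) differ from the row positions (0, 1, 2).
df = pd.DataFrame({"A": [10, 20, 30]}, index=[5, 6, 7])

by_label = df.loc[[5]]       # row whose index label is 5
by_position = df.iloc[[0]]   # first row, whatever its label is
last_two = df.iloc[1:3]      # position slice: second and third rows

print(by_label)
print(last_two)
```

Here df.loc[[5]] and df.iloc[[0]] return the same row, because the label 5 happens to sit at position 0.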
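You need some parentheses, and you need to use the pandas element-wise operators: & and | instead of and and or. The parentheses matter because & binds more tightly than > and <, so df['A'] > 0 & df['B'] > 0 is parsed as df['A'] > (0 & df['B']) > 0, and both that chained comparison and the or force Python to reduce a whole Series to a single True/False, which is what raises the error: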
df = df[((df['A'] > 0) & (df['B'] > 0)) | ((df['A'] < 0) & (df['B'] < 0))]
Keep in mind what this is doing: you're just building a boolean Series like [True, False, True, True], one value per row, and passing it to the df indexer, which keeps each row whose corresponding value is True and drops each row whose value is False.
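To make the mask visible, here is a small sketch that builds it step by step. The sample values and the intermediate names both_pos, both_neg and mask are assumptions for illustration; only the column names 'A' and 'B' come from the question.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({"A": [1, -2, 3, -4, 0], "B": [5, -6, -7, 8, 1]})

# Each comparison is wrapped in parentheses before being combined with & or |.
both_pos = (df["A"] > 0) & (df["B"] > 0)
both_neg = (df["A"] < 0) & (df["B"] < 0)
mask = both_pos | both_neg

print(mask.tolist())  # [True, True, False, False, False]

# Indexing with the mask keeps only the rows marked True.
filtered = df[mask]
print(filtered)
```

Naming the intermediate masks is optional, but it keeps each condition readable and avoids deeply nested parentheses.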