I have a set, remove_set, and I want to remove all rows in a dataframe where a column value is in that set.
df = df[df.column_in_set not in remove_set]
This gives me the error:
'Series' objects are mutable, thus they cannot be hashed.
What is the most pandas/pythonic way to solve this problem? I could iterate through the rows and figure out the ilocs to exclude, but that seems a little inelegant.
Some sample input and expected output.
Input:
column_in_set value_2 value_3
1 'a' 3
2 'b' 4
3 'c' 5
4 'd' 6
remove = set([2,4])
Output:
column_in_set value_2 value_3
1 'a' 3
3 'c' 5
To make the selection you can write:
df[~df['column_in_set'].isin(remove)]
isin() simply checks whether each value of the column/Series is in a set (or list or other iterable), returning a boolean Series.
In this case, we only want to include rows of the DataFrame whose value is not in remove, so we invert the boolean values with ~ and then use the result to index the DataFrame.
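Here is a minimal, self-contained sketch of the steps above, assuming the sample data from the question. The variable names mask and result are only illustrative and are not part of the original post.

```python
import pandas as pd

# Sample data rebuilt from the question's input table.
df = pd.DataFrame({
    "column_in_set": [1, 2, 3, 4],
    "value_2": ["a", "b", "c", "d"],
    "value_3": [3, 4, 5, 6],
})
remove = {2, 4}

# isin() returns a boolean Series: True where the value is in `remove`.
mask = df["column_in_set"].isin(remove)

# ~ inverts the mask, so only rows whose value is NOT in `remove` are kept.
result = df[~mask]
print(result)
```

The original attempt fails because Python's `not in` operator tries to hash the whole Series to look it up in the set, rather than testing each element, which is why the element-wise isin() is needed.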