I'm trying to remove the rows of df whose list in Column1 contains any value present in lst.
I know I can use df[df[x].isin(y)] when the column holds single strings, but I'm not sure how to adapt that approach to a column of lists.
lst = ['f','a']
df:
      Column1 Out1
0  ['x', 'y']    a
1  ['a', 'b']    i
2  ['c', 'd']    o
3  ['e', 'f']    u
etc.
I've attempted to use a list comprehension, but it doesn't seem to work the same way with pandas:
df = df[[i for x in list for i in df['Column1']]]
Error:
TypeError: unhashable type: 'list'
My expected output is as follows, with the rows removed whose lists contain any of the values in lst:
      Column1 Out1
0  ['x', 'y']    a
1  ['c', 'd']    o
etc.
You can convert the values to sets and use & to intersect them with set(lst); to invert the mask, use ~:
import pandas as pd

df = pd.DataFrame({'Column1': [['x','y'], ['a','b'], ['c','d'], ['e','f']],
                   'Out1': list('aiou')})
lst = ['f','a']
df1 = df[~(df['Column1'].apply(set) & set(lst))]
print (df1)
  Column1 Out1
0  [x, y]    a
2  [c, d]    o
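If the Series & set shortcut raises an error on your pandas version (an assumption on my part: newer releases may treat the set as a list-like and try to align it element-wise), a minimal sketch of the same idea is to test each row explicitly with set.isdisjoint. The names blocked and mask are just illustrative:

```python
import pandas as pd

df = pd.DataFrame({'Column1': [['x', 'y'], ['a', 'b'], ['c', 'd'], ['e', 'f']],
                   'Out1': list('aiou')})
lst = ['f', 'a']

# Build the lookup set once; isdisjoint accepts any iterable,
# so each row's list can be passed to it directly.
blocked = set(lst)

# True for rows whose list shares no value with lst.
mask = df['Column1'].apply(blocked.isdisjoint)

# Keep only those rows.
df1 = df[mask]
print(df1)
```

This keeps rows 0 and 2, the same as the & version, without relying on how pandas handles a set on the right-hand side of the operator.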
Solution with a nested list comprehension: the inner comprehension gives a list of booleans for each row, so all is needed to check that every value is True (that is, no value in the row is in lst):
df1 = df[[all([x not in lst for x in i]) for i in df['Column1']]]
print (df1)
  Column1 Out1
0  [x, y]    a
2  [c, d]    o
For reference, these are the inner lists of booleans before all is applied:
print ([[x not in lst for x in i] for i in df['Column1']])
[[True, True], [False, True], [True, True], [True, False]]
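Note that the expected output in the question is numbered 0, 1, while boolean filtering keeps the original labels (0 and 2 here). A minimal sketch, assuming a fresh index is wanted, is to build the mask with any and then call reset_index(drop=True). The name has_match is only illustrative:

```python
import pandas as pd

df = pd.DataFrame({'Column1': [['x', 'y'], ['a', 'b'], ['c', 'd'], ['e', 'f']],
                   'Out1': list('aiou')})
lst = ['f', 'a']

# True for rows whose list contains at least one value from lst.
has_match = [any(x in lst for x in row) for row in df['Column1']]

# Invert the mask to drop the matching rows, then renumber from 0.
df1 = df[[not m for m in has_match]].reset_index(drop=True)
print(df1)
```

Using any with a generator stops at the first match in each row, so it does not need to build the intermediate list of booleans.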