
Remove element from list in pandas dataframe based on value in column

Tags:

python

pandas

Let's say I have the following dataframe:

import pandas as pd

a = [[1,2,3,4,5,6],[23,23,212,223,1,12]]
b = [1,1]

df = pd.DataFrame(zip(a,b), columns = ['a', 'b'])

My goal is to remove, from each list in column a, the value held in the same row of column b. My attempt at doing so is below:

df['a'] = [i.remove(j) for i,j in zip(df.a, df.b)]

The logic seems sound to me; however, I'm ending up with df['a'] being a series of nulls. What is going on here?

ben890 asked Feb 04 '23 16:02


1 Answer

Here's an alternative way of doing it:

In []:
df2 = df.explode('a')
df['a'] = df2.a[df2.a != df2.b].groupby(level=0).apply(list)
df

Out[]:
                        a  b
0         [2, 3, 4, 5, 6]  1
1  [23, 23, 212, 223, 12]  1
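
As for why the original attempt produces nulls: list.remove mutates the list in place and returns None, so the comprehension collects one None per row. A minimal sketch of a more direct fix, assuming the same example data as in the question, builds new lists instead of mutating the old ones. The names demo and result are only illustrative, and the dict constructor is used here in place of zip purely for brevity:

```python
import pandas as pd

a = [[1, 2, 3, 4, 5, 6], [23, 23, 212, 223, 1, 12]]
b = [1, 1]
df = pd.DataFrame({"a": a, "b": b})

# list.remove changes the list in place and returns None,
# which is why the original comprehension yields only None values.
demo = [1, 2, 3]
result = demo.remove(1)  # demo is now [2, 3]; result is None

# Build a new list per row, keeping every element that differs from b.
df["a"] = [[x for x in lst if x != val] for lst, val in zip(df["a"], df["b"])]
print(df)
```

Note that this filter, like the explode approach above, drops every occurrence of the value in b, whereas list.remove would drop only the first one.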
AChampion answered Feb 06 '23 14:02