
Pandas Dataframe Check if column value is in column list

I have a dataframe df:

import pandas as pd

data = {'id':[12,112],
        'idlist':[[1,5,7,12,112],[5,7,12,111,113]]
       }
df=pd.DataFrame.from_dict(data)

which looks like this:

    id                idlist
0   12    [1, 5, 7, 12, 112]
1  112  [5, 7, 12, 111, 113]

I need to check whether each row's id is in that row's idlist, and either select or flag the row. I have tried variations of the following, and each raises the error shown in the comment:

df=df.loc[df.id.isin(df.idlist),:] #TypeError: unhashable type: 'list'
df['flag']=df.where(df.idlist.isin(df.idlist),1,0) #TypeError: unhashable type: 'list'

Would .apply or a list comprehension be a possible way to do this?

I am looking for a solution here that either selects the rows where id is in idlist, or flags the row with a 1 where id is in idlist. The resulting df should be either:

   id              idlist
0  12  [1, 5, 7, 12, 112]

or:

   flag   id                idlist
0     1   12    [1, 5, 7, 12, 112]
1     0  112  [5, 7, 12, 111, 113]

Thanks for the help!

clg4 asked Nov 27 '17 14:11



1 Answer

Use apply:

df['flag'] = df.apply(lambda x: int(x['id'] in x['idlist']), axis=1)
print (df)
    id                idlist  flag
0   12    [1, 5, 7, 12, 112]     1
1  112  [5, 7, 12, 111, 113]     0

Similar:

df['flag'] = df.apply(lambda x: x['id'] in x['idlist'], axis=1).astype(int)
print (df)
    id                idlist  flag
0   12    [1, 5, 7, 12, 112]     1
1  112  [5, 7, 12, 111, 113]     0

With list comprehension:

df['flag'] = [int(x[0] in x[1]) for x in df[['id', 'idlist']].values.tolist()]
print (df)
    id                idlist  flag
0   12    [1, 5, 7, 12, 112]     1
1  112  [5, 7, 12, 111, 113]     0
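
The list comprehension can also be written with zip over the two columns, which pairs each id with its own list without first building a list of rows. This is only a sketch of that variant, assuming the same df as in the question; the loop variable names are arbitrary:

```python
import pandas as pd

df = pd.DataFrame({
    'id': [12, 112],
    'idlist': [[1, 5, 7, 12, 112], [5, 7, 12, 111, 113]],
})

# Pair each id with the list from the same row and test membership row by row.
df['flag'] = [int(i in lst) for i, lst in zip(df['id'], df['idlist'])]
print(df)
```

The membership test is still plain Python run once per row, so this is a readability choice rather than a vectorized operation.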

Solutions for filtering:

df = df[df.apply(lambda x: x['id'] in x['idlist'], axis=1)]
print (df)
   id              idlist
0  12  [1, 5, 7, 12, 112]

df = df[[x[0] in x[1] for x in df[['id', 'idlist']].values.tolist()]]
print (df)

   id              idlist
0  12  [1, 5, 7, 12, 112]
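
Another possible approach, not part of the solutions above, is to explode the lists and compare column to column. The sketch below assumes pandas 0.25 or later (where DataFrame.explode is available) and a unique index on df; the names exploded, mask and selected are only illustrative:

```python
import pandas as pd

df = pd.DataFrame({
    'id': [12, 112],
    'idlist': [[1, 5, 7, 12, 112], [5, 7, 12, 111, 113]],
})

# One row per list element; the original index label is repeated for each element.
exploded = df.explode('idlist')

# Compare every element with its row's id, then collapse back to one value per original row.
mask = exploded['idlist'].eq(exploded['id']).groupby(level=0).any()

df['flag'] = mask.astype(int)   # flag variant
selected = df[mask]             # filter variant; df itself is not overwritten
print(df)
print(selected)
```

Keeping the boolean mask in its own variable lets the same result drive both the flag and the filter. Whether this is faster than apply depends on the list lengths, so it is worth timing on real data.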
jezrael answered Oct 16 '22 07:10