pandas function with isin

Tags:

python

pandas

I have a dataframe like this:

aa        bb  cc
[a, x, y] a   1
[b, d, z] b   2
[c, e, f] s   3
np.nan    d   4

I'm trying to create a new column like this:

aa        bb  cc dd
[a, x, y] a   1  True
[b, d, z] b   2  True
[c, e, f] s   3  False
np.nan    d   4  False

My current solution is:

def some_function(row):
    if row['bb'].isin(row['aa'])==True:
        return True
    return False
df['dd'] = df.apply(lambda row: some_function(row), axis=1)

But this raises an error: ("'str' object has no attribute 'isin'", 'occurred at index 0')

I suspect I'm missing something about how isin is supposed to be used.

Essentially, I need to check, row by row, whether the str value in bb is in the list held in aa.

Any ideas on how to do this?

Kvothe asked Oct 18 '17 09:10



1 Answer

You need the in operator to check membership in a list:

df['dd'] = df.apply(lambda x: x.bb in x.aa, axis=1)
print (df)
          aa bb  cc     dd
0  [a, x, y]  a   1   True
1  [b, d, z]  b   2   True
2  [c, e, f]  s   3  False
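
For anyone who wants to reproduce this, here is a minimal self-contained sketch. The frame is rebuilt by hand from the first three rows of the question; how the original frame was constructed is not shown, so that part is an assumption. With axis=1, each row is passed as a Series, so row["bb"] is a plain Python str (hence the "'str' object has no attribute 'isin'" error) and the in operator is the natural membership test.

```python
import pandas as pd

# Sample frame rebuilt from the question's first three rows (assumed construction).
df = pd.DataFrame({
    "aa": [["a", "x", "y"], ["b", "d", "z"], ["c", "e", "f"]],
    "bb": ["a", "b", "s"],
    "cc": [1, 2, 3],
})

# With axis=1 each row arrives as a Series: row["bb"] is a plain str and
# row["aa"] is a plain list, so membership is tested with `in`, not .isin().
df["dd"] = df.apply(lambda row: row["bb"] in row["aa"], axis=1)
print(df)
```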

EDIT: To also require a second condition (here cc == 1):

df['dd'] = df.apply(lambda x: (x.bb in x.aa) and (x.cc == 1), axis=1) 
print (df)
          aa bb  cc     dd
0  [a, x, y]  a   1   True
1  [b, d, z]  b   2  False
2  [c, e, f]  s   3  False

Or, combining the row-wise result with a vectorized comparison:

df['dd'] = df.apply(lambda x: x.bb in x.aa, axis=1) & (df['cc'] == 1)
print (df)
          aa bb  cc     dd
0  [a, x, y]  a   1   True
1  [b, d, z]  b   2  False
2  [c, e, f]  s   3  False

EDIT: To handle NaN values in aa, check that the cell is a list first:

df['dd'] = df.apply(lambda x: x.bb in x.aa if isinstance(x.aa, list) else False, axis=1)
print (df)
          aa bb  cc     dd
0  [a, x, y]  a   1   True
1  [b, d, z]  b   2   True
2  [c, e, f]  s   3  False
4        NaN  d   4  False
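
As an alternative sketch (not part of the answer above), a list comprehension over zip of the two columns gives the same result without a row-wise apply, which is often faster on larger frames. The sample frame, including the NaN row, is again rebuilt by hand as an assumption.

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the question, including the NaN row (assumed construction).
df = pd.DataFrame({
    "aa": [["a", "x", "y"], ["b", "d", "z"], ["c", "e", "f"], np.nan],
    "bb": ["a", "b", "s", "d"],
    "cc": [1, 2, 3, 4],
})

# Pair each list in aa with the value in bb from the same row.
# The isinstance check makes non-list cells (such as NaN) evaluate to False.
df["dd"] = [
    isinstance(items, list) and value in items
    for items, value in zip(df["aa"], df["bb"])
]
print(df)
```

The isinstance guard plays the same role as the type check in the apply version: NaN is a float, so testing membership on it directly would raise a TypeError.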
jezrael answered Oct 05 '22 23:10