Two columns match both elements of a list

I have a dataframe with two columns:

import pandas as pd
data={'A':['x','y','z','r','x','z'],'B':[1,2,3,4,1,7]}
df=pd.DataFrame(data)

That gets me:

A | B
x | 1
y | 2
z | 3
r | 4
x | 1
z | 7

Then I have a list of n two-element lists:

list_of_lists=[['x',1],['x',4],['z',3],['y',1]]

For each row, I want to know whether some sub-list has its first element equal to column A and its second element equal to column B, getting something like:

A | B | Match
x | 1 | True
y | 2 | False
z | 3 | True
r | 4 | False
x | 1 | True
z | 7 | False

I thought about creating two lists, one for each element of the sub-lists, and doing something like np.where with both conditions, but there must be a cleaner way.

Alejandro A asked Dec 18 '22 14:12


2 Answers

Use DataFrame.merge with a helper DataFrame built from the list, a left join, and indicator=True, then test whether the resulting _merge column equals 'both':

df1 = df.merge(pd.DataFrame(list_of_lists, columns=df.columns), how='left', indicator=True)

df['Match'] = df1['_merge'].eq('both')
print(df)
   A  B  Match
0  x  1   True
1  y  2  False
2  z  3   True
3  r  4  False
4  x  1   True
5  z  7  False
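
One thing to watch with the merge approach: if the list contains the same pair more than once, the left join returns one row per matching helper row, so the merged frame no longer lines up with df. A minimal sketch of a guard, assuming a hypothetical input `pairs_with_dupes` that is not part of the question:

```python
import pandas as pd

df = pd.DataFrame({'A': ['x', 'y', 'z', 'r', 'x', 'z'],
                   'B': [1, 2, 3, 4, 1, 7]})

# Hypothetical list with a repeated pair, used only to show the problem.
pairs_with_dupes = [['x', 1], ['x', 1], ['z', 3]]
helper = pd.DataFrame(pairs_with_dupes, columns=df.columns)

# Without deduplication, each ('x', 1) row in df matches two helper rows,
# so the merged frame has more rows than df.
naive = df.merge(helper, how='left', indicator=True)

# Dropping duplicate pairs first keeps the result row-for-row with df.
merged = df.merge(helper.drop_duplicates(), how='left', indicator=True)

# Convert to a plain array so the assignment does not depend on index alignment.
match = merged['_merge'].eq('both').to_numpy()
df['Match'] = match
```

With the list from the question there are no repeated pairs, so the original two-liner is enough; `drop_duplicates()` only matters when the list may repeat itself.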
jezrael answered Dec 30 '22 04:12


You can try:

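>>> import numpy as np
>>> df2 = pd.DataFrame(data=list_of_lists, columns=df.columns)
>>> # less readable but slightly faster:
>>> # df2 = pd.DataFrame(dict(zip(['A','B'],zip(*list_of_lists))))
>>> # note: np.isin tests each cell against all values in df2, not whole rows
>>> df['Match'] = np.isin(df, df2).all(1)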
>>> df
   A  B  Match
0  x  1   True
1  y  2  False
2  z  3   True
3  r  4  False
4  x  1   True
5  z  7  False
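
A caveat on the np.isin version: it checks every cell separately against all values in the helper frame, so a row can come out True when both of its values appear somewhere in the list but never together as a pair. The sketch below shows this on a small hypothetical frame (not the one from the question) and compares it with a row-wise check on tuples, which is one possible alternative rather than what the answer proposes:

```python
import numpy as np
import pandas as pd

list_of_lists = [['x', 1], ['x', 4], ['z', 3], ['y', 1]]

# Hypothetical frame: ('z', 1) is not a pair in the list,
# although 'z' and 1 both occur in it.
df = pd.DataFrame({'A': ['x', 'z', 'y'], 'B': [1, 1, 2]})
df2 = pd.DataFrame(list_of_lists, columns=df.columns)

# Cell-by-cell membership, then "all cells in the row are known values".
elementwise = np.isin(df, df2).all(1)

# Row-wise membership: compare whole (A, B) tuples against a set of pairs.
pair_set = {tuple(pair) for pair in list_of_lists}
rowwise = [pair in pair_set for pair in zip(df['A'], df['B'])]

print(elementwise.tolist())  # [True, True, False]
print(rowwise)               # [True, False, False]
```

On the data in the question both approaches give the same column, so the difference only shows up when values can recombine across pairs.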
Sayandip Dutta answered Dec 30 '22 04:12