How to use isin while ignoring index

Tags: python, pandas

I'm trying to check whether each row of one dataframe exists in another dataframe. I'm avoiding a join/merge because it can create duplicate rows, and filtering those out afterwards might also remove genuine duplicates that I want to keep.

Example:

table1 = pd.DataFrame({'a':[1, 2, 5, 3, 4],
              'b':['a', 'b', 'e', 'c', 'd']})
table2 = pd.DataFrame({'a':[1, 4, 3, 6, 2],
              'b':['a', 'd', 'c', 'f', 'b']})


table1.isin(table2)

       a      b
0   True   True
1  False  False
2  False  False
3  False  False
4  False  False

I would like all of these to be True except at index 2, where the row (5, e) doesn't exist in table2.

Matt W. asked Jan 02 '23 07:01


2 Answers

If I understand correctly, you want an element-wise check that ignores the index:

table1.stack().isin(table2.stack().values).unstack()
Out[207]: 
       a      b
0   True   True
1   True   True
2  False  False
3   True   True
4   True   True

To check on a row basis instead:

table1.astype(str).sum(1).isin(table2.astype(str).sum(1))
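
One caveat with the row-based check above: it concatenates the values of each row into a single string, and without a separator two different rows can produce the same string. The following is a minimal sketch of that pitfall, not part of the original answer; the frames `left` and `right` and the `'|'` separator are hypothetical, chosen only for illustration.

```python
import pandas as pd

# Hypothetical frames: the rows differ, but both concatenate to '123'
left = pd.DataFrame({'a': [1], 'b': ['23']})
right = pd.DataFrame({'a': [12], 'b': ['3']})

# Plain concatenation: '1' + '23' and '12' + '3' collide
no_sep = left.astype(str).apply(''.join, axis=1).isin(
    right.astype(str).apply(''.join, axis=1))

# With a separator the keys become '1|23' and '12|3', which differ
with_sep = left.astype(str).apply('|'.join, axis=1).isin(
    right.astype(str).apply('|'.join, axis=1))

print(no_sep.tolist())    # false positive
print(with_sep.tolist())  # rows correctly reported as different
```

A separator only helps if it cannot appear in the data itself, so pick one accordingly.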

Or by using merge:

table1.merge(table2.assign(vec=True),how='left').fillna(False)
Out[232]: 
   a  b    vec
0  1  a   True
1  2  b   True
2  5  e  False
3  3  c   True
4  4  d   True
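
Since the question mentions that merging can duplicate rows, here is a hedged variant of the merge approach, assuming the duplication comes from repeated rows in table2. It drops duplicates from the lookup side only and uses the `indicator` parameter of `merge`; the names `lookup`, `merged` and `in_table2` are illustrative, not from the original answer.

```python
import pandas as pd

table1 = pd.DataFrame({'a': [1, 2, 5, 3, 4],
                       'b': ['a', 'b', 'e', 'c', 'd']})
table2 = pd.DataFrame({'a': [1, 4, 3, 6, 2],
                       'b': ['a', 'd', 'c', 'f', 'b']})

# Deduplicate only the lookup side, so each row of table1 matches at most once
# and duplicates inside table1 are kept as they are
lookup = table2.drop_duplicates()

# indicator=True adds a '_merge' column with 'both' or 'left_only'
merged = table1.merge(lookup, how='left', on=['a', 'b'], indicator=True)

# to_numpy() sidesteps index alignment if table1 has a non-default index
table1_flagged = table1.assign(in_table2=(merged['_merge'] == 'both').to_numpy())
print(table1_flagged)
```

Because the lookup rows are unique, the left merge returns exactly one row per row of table1, in the same order.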
BENY answered Jan 18 '23 04:01


To compare each value separately, convert table2 to a 1-D array:

a = table1.isin(table2.values.ravel())
print (a)
       a      b
0   True   True
1   True   True
2  False  False
3   True   True
4   True   True

To compare each row separately:

a = (table1.apply(tuple, 1).isin(table2.apply(tuple, 1)))

Or:

a = (table1.astype(str).apply('###'.join, 1).isin(table2.astype(str).apply('###'.join, 1)))


print (a)
0     True
1     True
2    False
3     True
4     True
dtype: bool

To make the difference clearer, the input data is changed:

table1 = pd.DataFrame({'a':[1, 2, 5, 3, 4],
              'b':['d', 'b', 'e', 'c', 'd']})
table2 = pd.DataFrame({'a':[1, 4, 3, 6, 2],
              'b':['a', 'd', 'c', 'f', 'b']})

print (table1)
   a  b
0  1  d -> changed from a to d
1  2  b
2  5  e
3  3  c
4  4  d

print (table2)
   a  b
0  1  a
1  4  d
2  3  c
3  6  f
4  2  b

a = table1.isin(table2.values.ravel())
print (a)
       a      b
0   True   True  -> d exists somewhere in table2, so True
1   True   True
2  False  False
3   True   True
4   True   True

a = (table1.apply(tuple, 1).isin(table2.apply(tuple, 1)))
print (a)
0    False -> row (1, d) is not in table2, which only has (1, a), so False
1     True
2    False
3     True
4     True
dtype: bool
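
The row-wise check can also be sketched without `apply`, by treating each row as an entry of a `MultiIndex` and using `MultiIndex.isin`. This is an alternative to the tuple approach above rather than part of the original answer; the name `row_mask` is illustrative, and the frames are the changed ones from this explanation.

```python
import pandas as pd

table1 = pd.DataFrame({'a': [1, 2, 5, 3, 4],
                       'b': ['d', 'b', 'e', 'c', 'd']})
table2 = pd.DataFrame({'a': [1, 4, 3, 6, 2],
                       'b': ['a', 'd', 'c', 'f', 'b']})

# Each row becomes one (a, b) entry; isin compares whole entries
# and does not look at the row index of either frame
rows1 = pd.MultiIndex.from_frame(table1)
rows2 = pd.MultiIndex.from_frame(table2)

# isin returns a NumPy boolean array; wrap it to keep table1's index
row_mask = pd.Series(rows1.isin(rows2), index=table1.index)
print(row_mask)
```

This assumes both frames have the same columns in the same order, since entries are compared position by position.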
jezrael answered Jan 18 '23 04:01