I have two dataframes, each consisting of a single column.
df has column: id1
id1
1
2
3
4
5
6
df2 has column: id2
id2
2
1
5
4
As you can see, df['id1'] contains values which are not in df2['id2']:
3, 6
Is there any way to find them by taking the difference of the two dataframe columns, or any other way?
I tried it using
df2.isin(df)
but that only returns bool values, and I want the actual rows.
There are a number of ways you can solve this. One is the difference method on Pandas Index objects,
which returns the values in the calling index that are missing from the index passed to it.
import pandas as pd
# build an Index from each column; note that id2 lives in df2, not df
idx1 = pd.Index(df.id1)
idx2 = pd.Index(df2.id2)
idx1.difference(idx2).values
array([3, 6])
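For anyone who wants to run this end to end, here is a minimal self-contained sketch. It assumes the two frames are built from the values shown in the question; the name missing is only illustrative.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({"id1": [1, 2, 3, 4, 5, 6]})
df2 = pd.DataFrame({"id2": [2, 1, 5, 4]})

# Values present in df.id1 but absent from df2.id2
missing = pd.Index(df["id1"]).difference(pd.Index(df2["id2"]))
print(missing.values)
```

Keep in mind that Index.difference returns unique values, so duplicates in id1 would be collapsed. If you need every matching row, the isin approach is a better fit.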
With isin and a negated boolean mask you get the same values, returned as the actual rows of df:
df[~df.id1.isin(df2.id2)]
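Since the question asks for the actual rows, here is a small runnable sketch of the isin approach. It assumes the sample data from the question; the names mask and rows are only illustrative.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({"id1": [1, 2, 3, 4, 5, 6]})
df2 = pd.DataFrame({"id2": [2, 1, 5, 4]})

# True where id1 also appears in df2.id2
mask = df["id1"].isin(df2["id2"])

# Negate the mask to keep only the rows of df that have no match in df2
rows = df[~mask]
print(rows)

# The original index labels are kept; reset them if you want 0, 1, ...
rows_renumbered = rows.reset_index(drop=True)
```

Unlike the Index and set approaches, this keeps any other columns of df and preserves duplicate ids.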
You could also use set operations (sets are unordered, so the order of the result is not guaranteed):
list(set(df.id1) - set(df2.id2))
[3, 6]
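If you need the set-based result in a predictable order, one option is to sort it. This is a sketch assuming the sample data from the question; the name missing is only illustrative.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({"id1": [1, 2, 3, 4, 5, 6]})
df2 = pd.DataFrame({"id2": [2, 1, 5, 4]})

# Set difference, then sort because sets carry no ordering guarantee
missing = sorted(set(df["id1"]) - set(df2["id2"]))
print(missing)
```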