Compare every element in two different-size dataframes and get added or deleted word in pandas

Question

I have a question about a pandas DataFrame operation.

Suppose I have two DataFrames of different sizes (they have the same number of rows but a different number of columns):

import pandas as pd

a = pd.DataFrame({"code1": ['A','B','C','D'], "code2": ['E','F','G','H']})
b = pd.DataFrame({"code1": ['A1','B','C','D'], "code2": ['E','F','G','N'], "code3": ['A2','L','M','']})

For visualization:

a:
  code1 code2
0     A     E
1     B     F
2     C     G
3     D     H

b:
  code1 code2 code3
0    A1     E    A2
1     B     F     L
2     C     G     M
3     D     N

My ideal output is a DataFrame 'c' like this:

c:
   addedword  deletedword
0  A1,A2      A
1  L
2  M
3  N          H

Basically, I want to compare each row in 'a' with the corresponding row in 'b', element by element, so that any added or deleted strings are shown in a new DataFrame.

piRSquared · Accepted Answer

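Use set differences: treat each row as a set of values, so that b's row minus a's row gives the added words and a's row minus b's row gives the deleted ones.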
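To make the idea concrete for a single row, here is a minimal sketch in plain Python. The names row_a, row_b, added and deleted are only illustrative; the sets are row 0 of 'a' and 'b' from the question, written out by hand.

```python
# Row 0 of each frame, written out by hand as sets of cell values.
row_a = {"A", "E"}
row_b = {"A1", "E", "A2"}

added = row_b - row_a    # values present in b but not in a
deleted = row_a - row_b  # values present in a but not in b

print(sorted(added))    # ['A1', 'A2']
print(sorted(deleted))  # ['A']
```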

g = lambda x: map(set, x.values)          # turn each row of the 2-D array into a set
f = lambda t: (t[1] - t[0], t[0] - t[1])  # t is a pair of sets (row of a, row of b) -> (added, deleted)
h = lambda y: list(map(','.join, y))      # join each set into a comma-separated string
pd.DataFrame(
    list(map(h, map(f, zip(*map(g, (a, b)))))),
    columns=['Added', 'Deleted']
)

   Added Deleted
0  A1,A2       A
1      L        
2      M        
3     ,N       H
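The ",N" in row 3 appears because the empty string in code3 of 'b' is itself a member of that row's set, so it counts as an "added" value. If that is not wanted, one possible variation (a sketch, not part of the original answer) is to drop empty strings before taking the differences and to sort each result so the order of the joined words is stable. The helper names row_sets and diff_words are hypothetical, and the sketch assumes that every cell is a string and that both frames have the same number of rows, aligned by position.

```python
import pandas as pd

a = pd.DataFrame({"code1": ["A", "B", "C", "D"],
                  "code2": ["E", "F", "G", "H"]})
b = pd.DataFrame({"code1": ["A1", "B", "C", "D"],
                  "code2": ["E", "F", "G", "N"],
                  "code3": ["A2", "L", "M", ""]})

def row_sets(df):
    # One set per row; empty strings are assumed to mean "no value" and are skipped.
    return [{v for v in row if v != ""} for row in df.values.tolist()]

def diff_words(old, new):
    # Hypothetical helper: compares rows by position, not by index label.
    rows = []
    for s_old, s_new in zip(row_sets(old), row_sets(new)):
        rows.append({
            "Added": ",".join(sorted(s_new - s_old)),    # in new but not in old
            "Deleted": ",".join(sorted(s_old - s_new)),  # in old but not in new
        })
    return pd.DataFrame(rows, columns=["Added", "Deleted"])

c = diff_words(a, b)
print(c)
```

With the sample data from the question, row 3 then shows N under Added and H under Deleted. Note that, like the original, this compares whole rows as sets, so it ignores which column a value sits in and how many times it repeats within a row.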

Tags: python, string, pandas, dataframe

Yixian Wang
