
Pandas merge indicator custom value

Tags:

pandas

What is the fastest way to replace the indicator values with friendlier messages during a pandas merge? The default indicator=True yields left_only, right_only, and both. I want to change these to "Only present in last month's data", "Only present in current month's data", and "Present in Both month's data".

I am hoping to do it without a lambda function.

asked Sep 17 '25 15:09 by Victor


1 Answer

Creating a working example:

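import numpy as np
import pandas as pd

np.random.seed(0)  # fixed seed so the values below are reproducible
left = pd.DataFrame({'key': ['A', 'B', 'C', 'D'], 'value': np.random.randn(4)})
right = pd.DataFrame({'key': ['B', 'D', 'E', 'F'], 'value': np.random.randn(4)})

# outer merge; indicator=True adds a _merge column recording each row's source
merged = left.merge(right, on='key', how='outer', indicator=True)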
print(merged)

  key   value_x   value_y      _merge
0   A  1.764052       NaN   left_only
1   B  0.400157  1.867558        both
2   C  0.978738       NaN   left_only
3   D  2.240893 -0.977278        both
4   E       NaN  0.950088  right_only
5   F       NaN -0.151357  right_only

To map the values, pass a dict to Series.map (no lambda needed):

d = {"left_only": "Only present in last month's data",
     "right_only": "Only present in current month's data",
     "both": "Present in Both month's data"}

merged['_merge'] = merged['_merge'].map(d)
print(merged)

  key   value_x   value_y                                _merge
0   A  1.764052       NaN     Only present in last month's data
1   B  0.400157  1.867558          Present in Both month's data
2   C  0.978738       NaN     Only present in last month's data
3   D  2.240893 -0.977278          Present in Both month's data
4   E       NaN  0.950088  Only present in current month's data
5   F       NaN -0.151357  Only present in current month's data
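
As a possible alternative to map: the _merge column produced by indicator=True is a categorical, so renaming its three categories should relabel every row without touching the rows one by one. The sketch below rebuilds the same example; the name labels is only illustrative, and whether this is actually faster on your data is an assumption worth timing.

```python
import numpy as np
import pandas as pd

np.random.seed(0)
left = pd.DataFrame({'key': ['A', 'B', 'C', 'D'], 'value': np.random.randn(4)})
right = pd.DataFrame({'key': ['B', 'D', 'E', 'F'], 'value': np.random.randn(4)})

# Illustrative name; same mapping as the dict d above.
labels = {
    "left_only": "Only present in last month's data",
    "right_only": "Only present in current month's data",
    "both": "Present in Both month's data",
}

merged = left.merge(right, on='key', how='outer', indicator=True)

# _merge is categorical, so rename the categories themselves
# instead of mapping each row's value.
merged['_merge'] = merged['_merge'].cat.rename_categories(labels)
print(merged)
```

The column stays categorical after the rename, which keeps memory use low on large frames; call .astype(str) afterwards if plain strings are needed.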
answered Sep 19 '25 08:09 by anky