I have two pandas df which look like this:
df1:
pid Name score age
100 Ram 3 36
101 Tony 2 40
101 Jack 4 56
200 Jill 6 30
df2
pid Name score age
100 Ram 3 36
101 Tony 2 40
101 John 4 51
101 Jack 9 32
200 Jill 6 30
Both df's are indexed with 'pid'. I would like to compare df1 & df2 based on the column 'score'. i.e, I need to keep only those rows in df2 that are matching with df1 on index and value of score.
My expected result should be
new df2:
pid Name index age
100 Ram 3 36
101 Tony 2 40
101 John 4 51
200 Jill 6 30
Any help on this regard is highly appreciated.
Use merge by columns pid and score, but first create columns from index by reset_index, last create pid index again and for same columns of new DataFrame add reindex by df2.columns:
df = (pd.merge(df1.reset_index(),
df2.reset_index(), on=['score', 'pid'], how='left', suffixes=['_',''])
.set_index('pid')
.reindex(columns=df2.columns))
print (df)
Name score age
pid
100 Ram 3 36
101 Tony 2 40
101 John 4 51
200 Jill 6 30
Inputs:
print (df1)
Name score age
pid
100 Ram 3 36
101 Tony 2 40
101 Jack 4 56
200 Jill 6 30
print (df2)
Name score age
pid
100 Ram 3 36
101 Tony 2 40
101 John 4 51
101 Jack 9 32
200 Jill 6 30
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With