df1 = pd.DataFrame({'A': [0, 0], 'B': [None, 4]})
df2 = pd.DataFrame({'A': [1, 1], 'B': [3, 0]})
df1.combine(df2, take_smaller, fill_value=-5)
The code above yields a result that contains 4.0. Where does the 4.0 come from?
From the example in the docs:
take_smaller = lambda s1, s2: s1 if s1.sum() < s2.sum() else s2
This says: if the sum of a series (column) in df1 is less than the sum of the matching series in df2, return the series from df1; otherwise return the one from df2.
So when you do:
df1.combine(df2, take_smaller)
   A    B
0  0  3.0
1  0  0.0
This works fine.
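To see why column B comes from df2 in this case, it helps to look at the per-column sums that take_smaller compares. This is a minimal sketch, assuming pandas is imported as pd; it relies on Series.sum() skipping NaN by default, and the names sums1 and sums2 are only for illustration:

```python
import pandas as pd

df1 = pd.DataFrame({'A': [0, 0], 'B': [None, 4]})
df2 = pd.DataFrame({'A': [1, 1], 'B': [3, 0]})

# sum() skips NaN by default, so column B of df1 sums to 4.0
sums1 = df1.sum()
sums2 = df2.sum()
print(sums1)
print(sums2)

# A: 0 < 2 is True, so take_smaller keeps df1's column.
# B: 4 < 3 is False, so take_smaller keeps df2's column.
```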
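However, when you pass fill_value=-5, the sum of the second series (column B) in the first dataframe becomes smaller, since fill_value first fills the NaN and only then are the sums compared. Now (-5 + 4) < (3 + 0), so the column from df1, with values -5 and 4, is returned. That is where the 4.0 comes from.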
From the docs:
fill_value : scalar value, default None
    The value to fill NaNs with prior to passing any column to the merge func.
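The effect of fill_value can be reproduced by hand. The sketch below fills the NaN in column B of df1 with -5 before summing, which mirrors what the docs describe. It only illustrates the comparison and is not pandas' internal code; the name filled_b is hypothetical.

```python
import pandas as pd

take_smaller = lambda s1, s2: s1 if s1.sum() < s2.sum() else s2

df1 = pd.DataFrame({'A': [0, 0], 'B': [None, 4]})
df2 = pd.DataFrame({'A': [1, 1], 'B': [3, 0]})

# Fill NaN first, as fill_value does, then compare the sums
filled_b = df1['B'].fillna(-5)
print(filled_b.sum(), df2['B'].sum())

# With fill_value=-5, column B is now taken from df1 (already filled)
result = df1.combine(df2, take_smaller, fill_value=-5)
print(result)
```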