 

Merge Pandas Dataframe based on boolean function

I am looking for an efficient way to merge two pandas DataFrames based on a function that takes columns from both DataFrames as input and returns True or False. For example, assume I have the following "tables":

import pandas as pd

df_1 = pd.DataFrame(data=[1, 2, 3])
df_2 = pd.DataFrame(data=[4, 5, 6])


def validation(a, b):
    return ((a + b) % 2) == 0

I would like to join df_1 and df_2 on every pair of rows where the sum of their first columns is an even number. The resulting table would be:

       1 5
df_3 = 2 4
       2 6
       3 5

Please think of it as a general problem, not as a task to return just df_3. The solution should accept any function that validates a combination of columns and returns True or False.

THX Lazloo

Lazloo Xp asked Apr 15 '26 17:04


1 Answer

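For this particular validation function, two numbers have an even sum exactly when they have the same parity, so you can merge on parity: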

(df_1.assign(parity=df_1[0]%2)
     .merge(df_2.assign(parity=df_2[0]%2), on='parity')
     .drop('parity', axis=1)
)

output:

   0_x  0_y
0    1    5
1    3    5
2    2    4
3    2    6
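
The parity trick only works for this specific function. For an arbitrary boolean function, one possible approach is to build every pair of rows with a cross join and then keep the pairs for which the function returns True. The following is a minimal sketch, not a definitive implementation: the helper name merge_where and the l_/r_ column prefixes are made up for illustration, merge(how="cross") assumes pandas 1.2 or newer, and the function passed in is assumed to work element-wise on whole columns (as validation does).

```python
import pandas as pd

df_1 = pd.DataFrame(data=[1, 2, 3])
df_2 = pd.DataFrame(data=[4, 5, 6])


def validation(a, b):
    return ((a + b) % 2) == 0


def merge_where(left, right, left_col, right_col, predicate):
    """Hypothetical helper: keep the row pairs for which predicate is True."""
    # Prefix the column names so the two sides cannot collide after the join.
    l = left.add_prefix("l_")
    r = right.add_prefix("r_")
    # Cross join: every row of `left` paired with every row of `right`.
    pairs = l.merge(r, how="cross")
    # Assumption: predicate is vectorized and returns a boolean Series.
    mask = predicate(pairs[f"l_{left_col}"], pairs[f"r_{right_col}"])
    return pairs[mask].reset_index(drop=True)


df_3 = merge_where(df_1, df_2, 0, 0, validation)
print(df_3)
```

Note that the cross join materializes len(df_1) * len(df_2) rows before filtering, so this is only practical when that product fits in memory. If the function cannot operate on whole columns, the mask could instead be built row by row, for example with a list comprehension over the two columns, at the cost of speed.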
Quang Hoang answered Apr 17 '26 07:04


