Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

pandas left join where right is null on multiple columns

I have two pandas df x and y, both with the same 3 columns A B C (not nullable). I need to create a new df z, obtained by "subtracting from x the rows which are entirely identical to the rows of y", i.e. a

x left join y on x.A=y.A and x.B=y.B and x.C=y.C
where y.A is null

How would I do that? Got stuck with indexes, concat, merge, join, ...

Example:

dataframe x
A    B    C
q1   q2   q3
q4   q2   q3
q7   q2   q9

dataframe y
A    B    C
q4   q2   q3

dataframe z
A    B    C
q1   q2   q3
q7   q2   q9
like image 916
edoedoedo Avatar asked Mar 26 '18 08:03

edoedoedo


People also ask

What is the output of left Joining below 2 tables in pandas?

A left join, or left merge, keeps every row from the left dataframe. Result from left-join or left-merge of two dataframes in Pandas. Rows in the left dataframe that have no corresponding join value in the right dataframe are left with NaN values.

Can you join on two columns pandas?

To merge two pandas DataFrames on multiple columns use pandas. merge() method. merge() is considered more versatile and flexible and we also have the same method in DataFrame.

What is left on and right on in pandas merge?

left: use only keys from left frame, similar to a SQL left outer join; preserve key order. right: use only keys from right frame, similar to a SQL right outer join; preserve key order. outer: use union of keys from both frames, similar to a SQL full outer join; sort keys lexicographically.


1 Answers

I think need merge with indicator and filter only rows from left DataFrame:

df = x.merge(y, indicator='i', how='outer').query('i == "left_only"').drop('i', 1)
print (df)
    A   B    C
0  q1  q2   q3
2  q7  q2  q93
like image 149
jezrael Avatar answered Sep 27 '22 19:09

jezrael