
How to LEFT ANTI join under some matching condition


I have two tables. The first holds the core data: a pair of IDs (PC1 and P2) and some blob data (P3). The second holds blacklist data for the first table. I will call the first table in_df and the second blacklist_df.

What I want to do is remove rows from in_df whenever in_df.PC1 == blacklist_df.P1 and in_df.P2 == blacklist_df.B1. Here is a code snippet that shows more explicitly what I want to achieve.

in_df = sqlContext.createDataFrame(
    [[1,2,'A'],[2,1,'B'],[3,1,'C'],[4,11,'D'],[1,3,'D']],
    ['PC1','P2','P3'])
in_df.show()

+---+---+---+
|PC1| P2| P3|
+---+---+---+
|  1|  2|  A|
|  2|  1|  B|
|  3|  1|  C|
|  4| 11|  D|
|  1|  3|  D|
+---+---+---+

blacklist_df = sqlContext.createDataFrame([[1,2],[2,1]],['P1','B1'])
blacklist_df.show()

+---+---+
| P1| B1|
+---+---+
|  1|  2|
|  2|  1|
+---+---+

In the end, what I want to get is the following:

+---+--+--+
|PC1|P2|P3|
+---+--+--+
|  1| 3| D|
|  3| 1| C|
|  4|11| D|
+---+--+--+

I tried LEFT_ANTI join but I haven't been successful.

kee asked Jul 14 '18 22:07




1 Answer

Pass the join conditions as a list to the join function, and specify how='left_anti' as the join type:

in_df.join(
    blacklist_df,
    [in_df.PC1 == blacklist_df.P1, in_df.P2 == blacklist_df.B1],
    how='left_anti'
).show()

+---+---+---+
|PC1| P2| P3|
+---+---+---+
|  1|  3|  D|
|  4| 11|  D|
|  3|  1|  C|
+---+---+---+
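The conditions in the list are combined with AND, so a row of in_df is dropped only when both of its IDs match the same blacklist row. To check that logic without a Spark session, here is a minimal plain-Python sketch of what the left anti join computes on the sample data. The helper name left_anti_join and the tuple layout are assumptions made for illustration, not part of the PySpark API, and the sketch ignores NULL keys (in Spark, a NULL key never satisfies ==, so such rows would be kept).

```python
# Plain-Python sketch of left anti join semantics (no Spark required).
# Row layout mirrors the question: in_rows are (PC1, P2, P3),
# blacklist_rows are (P1, B1).
in_rows = [(1, 2, 'A'), (2, 1, 'B'), (3, 1, 'C'), (4, 11, 'D'), (1, 3, 'D')]
blacklist_rows = [(1, 2), (2, 1)]


def left_anti_join(left, right):
    """Keep the left rows whose (PC1, P2) pair matches no (P1, B1) pair.

    Hypothetical helper for illustration only; assumes no None keys.
    """
    blocked = set(right)  # both columns must match the same blacklist row
    return [row for row in left if (row[0], row[1]) not in blocked]


result = left_anti_join(in_rows, blacklist_rows)
for row in result:
    print(row)
```

The surviving rows are the same three as in the output of show() above, although Spark does not guarantee any particular row order.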
Psidom answered Sep 29 '22 12:09