Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to do outer join on two columns in Pig Latin

Tags:

I do outer joins on single columns in Pig like this

result = JOIN A by id LEFT OUTER, B by id; 

How do I join on two columns, something like -

WHERE A.id=B.id AND A.name=B.name 

What is the pig equivalent? I couldn't find any example in the pig manuals...any help?

like image 402
hese Avatar asked Nov 07 '11 15:11

hese


People also ask

How do you JOIN two fields in a pig?

Here is how you can perform a JOIN operation on two tables using multiple keys. grunt> Relation3_name = JOIN Relation2_name BY (key1, key2), Relation3_name BY (key1, key2);

What are Joins how many types of joins are there in Pig Latin?

Inner join is one of the most frequently used join. Inner join returns the common rows between the two tables based on the condition implied. Inner join is also called as equi join.

How do you create a relationship with a pig?

Now, you can use 'C' as the 'empty relation' that has one empty tuple. Show activity on this post. DEFINE GenerateRelationFromString(string) RETURNS relation { temp = LOAD 'somefile'; tempLimit1 = LIMIT temp 1; $relation = FOREACH tempLimit1 GENERATE FLATTEN(TOKENIZE('$string', ',')); };

What is flatten in pig?

As per Pig documentation: The FLATTEN operator looks like a UDF syntactically, but it is actually an operator that changes the structure of tuples and bags in a way that a UDF cannot. Flatten un-nests tuples as well as bags. The idea is the same, but the operation and result is different for each type of structure.


1 Answers

The above answer is actually an INNER join, the correct pig statement should be:

 join a by (id, name) LEFT OUTER, b by (id, name)  
like image 155
Samuel Kerrien Avatar answered Sep 26 '22 05:09

Samuel Kerrien