Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Postgres JOIN implementation

Could someone explain or point me to a resource that explains how a relational database (like Postgres or MySQL, I use Postgres more though) implements joins?

For instance, I can roughly tell you that indexes might be made of a B tree where nodes are conditions that might be present in a where clause. This index is built up whenever a change involves the index (like an insert). And from this information, I can assume that an unindexed column will be faster at insert and that an index might not help with certain searches, like regex or like pattern matching.

I'm looking to have a similar or better understanding of what happens when you do one or more joins.

like image 440
tau Avatar asked May 04 '14 19:05

tau


People also ask

How does join work in PostgreSQL?

The PostgreSQL Joins clause is used to combine records from two or more tables in a database. A JOIN is a means for combining fields from two tables by using values common to each.

Which join is faster in PostgreSQL?

Nested loop joins are particularly efficient if the outer relation is small, because then the inner loop won't be executed too often.

What type of join is join in Postgres?

INNER JOIN (simple join) It is the most common type of join. PostgreSQL INNER JOINS return all rows from multiple tables where the join condition is met.


1 Answers

In Postgresql, the planner/optimizer computes which of the 3 following methods to use (from PostgreSQL documentation):

nested loop join: The right relation is scanned once for every row found in the left relation. This strategy is easy to implement but can be very time consuming. (However, if the right relation can be scanned with an index scan, this can be a good strategy. It is possible to use values from the current row of the left relation as keys for the index scan of the right.)

merge join: Each relation is sorted on the join attributes before the join starts. Then the two relations are scanned in parallel, and matching rows are combined to form join rows. This kind of join is more attractive because each relation has to be scanned only once. The required sorting might be achieved either by an explicit sort step, or by scanning the relation in the proper order using an index on the join key.

hash join: the right relation is first scanned and loaded into a hash table, using its join attributes as hash keys. Next the left relation is scanned and the appropriate values of every row found are used as hash keys to locate the matching rows in the table.

like image 141
rofls Avatar answered Oct 14 '22 05:10

rofls