Could someone explain or point me to a resource that explains how a relational database (like Postgres or MySQL, I use Postgres more though) implements join
s?
For instance, I can roughly tell you that indexes might be made of a B tree where nodes are conditions that might be present in a where
clause. This index is built up whenever a change involves the index (like an insert
). And from this information, I can assume that an unindexed column will be faster at insert
and that an index might not help with certain searches, like regex
or like
pattern matching.
I'm looking to have a similar or better understanding of what happens when you do one or more join
s.
The PostgreSQL Joins clause is used to combine records from two or more tables in a database. A JOIN is a means for combining fields from two tables by using values common to each.
Nested loop joins are particularly efficient if the outer relation is small, because then the inner loop won't be executed too often.
INNER JOIN (simple join) It is the most common type of join. PostgreSQL INNER JOINS return all rows from multiple tables where the join condition is met.
In Postgresql, the planner/optimizer computes which of the 3 following methods to use (from PostgreSQL documentation):
nested loop join: The right relation is scanned once for every row found in the left relation. This strategy is easy to implement but can be very time consuming. (However, if the right relation can be scanned with an index scan, this can be a good strategy. It is possible to use values from the current row of the left relation as keys for the index scan of the right.)
merge join: Each relation is sorted on the join attributes before the join starts. Then the two relations are scanned in parallel, and matching rows are combined to form join rows. This kind of join is more attractive because each relation has to be scanned only once. The required sorting might be achieved either by an explicit sort step, or by scanning the relation in the proper order using an index on the join key.
hash join: the right relation is first scanned and loaded into a hash table, using its join attributes as hash keys. Next the left relation is scanned and the appropriate values of every row found are used as hash keys to locate the matching rows in the table.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With