I'm reading the book SQL Antipatterns and trying to understand its example of building a "tree" with a self-referential table such as
Comments
-------------------------------------------------------------------
comment_id | parent_id | author | comment
-------------------------------------------------------------------
1 NULL Fran What's the cause of this bug?
2 1 Ollie I think it's a null pointer.
3 2 Fran No, I checked for that.
4 1 Kukla We need to check for invalid input.
5 4 Ollie Yes, that's a bug.
6 4 Fran Yes, please add a check.
7 6 Kukla That fixed it.
So the book says
You can retrive a comment and its immediate children using a relatively simple query:
SELECT c1.*, c2.* FROM Comments c1 LEFT OUTER JOIN Comments c2 ON c2.parent_id = c1.comment_id
I'm trying to understand how this works. The way I've always understood left outer joins between a table t1 and t2 is that you take all the rows from t1 and for the rows where the ON clause is not satisfied you fill in NULL for the columns from the second table. In this case there is only one table, but I can imagine that the query is taking place with two tables where the second one is a copy of the first. Still, I don't understand how that query returns
two levels of the tree.
What exactly is the resulting table and can you walk me through the logic of how it is resulted?
The resulting table will be the following:
Comments
-------------------------------------------------------------------------------------------------------------------------------------------------------------------
c1.comment_id | c1.parent_id | c1.author | c1.comment | c2.comment_id | c2.parent_id | c2.author | c2.comment |
-------------------------------------------------------------------------------------------------------------------------------------------------------------------
1 NULL Fran What's the cause of this bug? 2 1 Ollie I think it's a null pointer.
1 NULL Fran What's the cause of this bug? 4 1 Kukla We need to check for invalid input.
2 1 Ollie I think it's a null pointer. 3 2 Fran No, I checked for that.
3 2 Fran No, I checked for that. NULL NULL NULL NULL
4 1 Kukla We need to check for invalid input. 5 4 Ollie Yes, that's a bug.
4 1 Kukla We need to check for invalid input. 6 4 Fran Yes, please add a check.
5 4 Ollie Yes, that's a bug. NULL NULL NULL NULL
6 4 Fran Yes, please add a check. 7 6 Kukla That fixed it.
7 6 Kukla That fixed it. NULL NULL NULL NULL
In the ON clause, we have c2.parent_id = c1.comment_id. This will mean that the "right" table (c2)'s parent.id will join with the "left" table (c1)'s comment_id.
The table branches out to itself by mapping each row with its child comments. The result on the right side (c2) will be all the children of c1 entries, which repeat for each child. Since we're doing a left join, rows without children will simply return NULL on the c2 columns.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With