I often find myself performing a couple of independent joins off a table. For instance, say we have the table collections
, which has independent one-to-N relationships with both photos
and songs
, where N is from zero to many.
Now, say we want to get a collection, and both its (independently) associated photos and songs.
I would typically use something like this:
SELECT
collections.collectionid as collectionid,
photos.name as photo_name,
songs.name as song_name
FROM collections
LEFT JOIN photos ON collections.collectionid = photos.collectionid
LEFT JOIN songs ON collections.collectionid = songs.collectionid
WHERE collections.collectionid = 14
Of course, left-joining one table to two other tables, if the first join results in M
rows and the second in N
rows, gives M * N
rows. This would seem suboptimal in terms of database-traffic and performance.
+--------------+------------+-----------+
| collectionid | photo_name | song_name |
+--------------+------------+-----------+
| 14 | 'x' | 'a' | \
| 14 | 'x' | 'b' | - Each photo is returned 3 times,
| 14 | 'x' | 'c' | / because 3 songs are returned.
| 14 | 'y' | 'a' | \
| 14 | 'y' | 'b' |
| 14 | 'y' | 'c' | /
+--------------+------------+-----------+
Alternatively, you can perform two selects: two separate queries, each joining collections
to a different table, giving M + N
rows:
SELECT
collections.collectionid as collectionid
song.name as song_name
FROM collections
LEFT JOIN songs on collections.collectionid = songs.collectionid
WHERE collections.collectionid = 14
and:
SELECT
collections.collectionid as collectionid
photos.name as photo_name
FROM collections
LEFT JOIN photos on collections.collectionid = photos.collectionid
WHERE collections.collectionid = 14
giving:
+--------------+------------+ +--------------+------------+
| collectionid | song_name | | collectionid | photo_name |
+--------------+------------+ +--------------+------------+
| 14 | 'a' | | 14 | 'x' |
| 14 | 'b' | | 14 | 'y' |
| 14 | 'c' | +--------------+------------+
+--------------+------------+
My question: What is the best way to handle this?
Neither of the above seems optimal. So, is there another way that results in M + N
rows, yet can be done in a single query?
Your first option (two independent JOINs) doesn't seem to provide you with a very useful result set (because the two subsidiary tables produce a semi-cartesian product and you have to de-duplicate the results in your application code).
The second option (two separate queries) is okay, unless you want to treat the results of the two queries as a single set for presentation purposes (for instance, sort them all together by a date field).
The best solution, I think, is to combine the two queries into one with UNION ALL
, producing a single result set with only the rows you actually want:
SELECT
collections.collectionid as collectionid,
photos.name as photo_name,
'photo' as document_type
FROM collections
LEFT JOIN photos on collections.collectionid = photos.collectionid
WHERE collections.collectionid = 14
UNION ALL
SELECT
collections.collectionid as collectionid,
song.name as photo_name
'song' as document_type
FROM collections
LEFT JOIN songs on collections.collectionid = songs.collectionid
WHERE collections.collectionid = 14
This kind of result set can be ORDERed BY
any field across the entire, combined set of records, allowing (for instance) to get the 20 most recent documents attached to the collection regardless of what type they are.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With