Imagine the following tables:
create table boxes( id int, name text, ...);
create table thingsinboxes( id int, box_id int, thing enum('apple,'banana','orange');
And the tables look like:
Boxes: id | name 1 | orangesOnly 2 | orangesOnly2 3 | orangesBananas 4 | misc thingsinboxes: id | box_id | thing 1 | 1 | orange 2 | 1 | orange 3 | 2 | orange 4 | 3 | orange 5 | 3 | banana 6 | 4 | orange 7 | 4 | apple 8 | 4 | banana
How do I select the boxes that contain at least one orange and nothing that isn't an orange?
How does this scale, assuming I have several hundred thousand boxes and possibly a million things in boxes?
I'd like to keep this all in SQL if possible, rather than post-processing the result set with a script.
I'm using both postgres and mysql, so subqueries are probably bad, given that mysql doesn't optimize subqueries (pre version 6, anyway).
SELECT b.*
FROM boxes b JOIN thingsinboxes t ON (b.id = t.box_id)
GROUP BY b.id
HAVING COUNT(DISTINCT t.thing) = 1 AND SUM(t.thing = 'orange') > 0;
Here's another solution that does not use GROUP BY:
SELECT DISTINCT b.*
FROM boxes b
JOIN thingsinboxes t1
ON (b.id = t1.box_id AND t1.thing = 'orange')
LEFT OUTER JOIN thingsinboxes t2
ON (b.id = t2.box_id AND t2.thing != 'orange')
WHERE t2.box_id IS NULL;
As always, before you make conclusions about the scalability or performance of a query, you have to try it with a realistic data set, and measure the performance.
I think Bill Karwin's query is just fine, however if a relatively small proportion of boxes contain oranges, you should be able to speed things up by using an index on the thing
field:
SELECT b.*
FROM boxes b JOIN thingsinboxes t1 ON (b.id = t1.box_id)
WHERE t1.thing = 'orange'
AND NOT EXISTS (
SELECT 1
FROM thingsinboxes t2
WHERE t2.box_id = b.id
AND t2.thing <> 'orange'
)
GROUP BY t1.box_id
The WHERE NOT EXISTS
subquery will only be run once per orange thing, so it's not too expensive provided there aren't many oranges.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With