I've got two tables in SQL, one with a project and one with categories that projects belong to, i.e. the JOIN would look roughly like:
Project | Category
--------+---------
Foo | Apple
Foo | Banana
Foo | Carrot
Bar | Apple
Bar | Carrot
Qux | Apple
Qux | Banana
(Strings replaced with IDs from a higher normal form, obviously, but you get the point here.)
What I want to do is allow filtering such that users can select any number of categories and results will be filtered to items that are members of all the selected categories. For example, if a user selects categories "Apple" and "Banana", projects "Foo" and "Qux" show up. If a user select categories "Apple", "Banana", and "Carrot" then only the "Foo" project shows up.
The first thing I tried was a simple SELECT DISTINCT Project FROM ... WHERE Category = 'Apple' AND Category = 'Banana', but of course that doesn't work since Apple and Banana show up in the same column in two different rows for any common project.
GROUP BY and HAVING don't do me any good, so tell me: is there an obvious way to do this that I'm missing, or is it really so complicated that I'm going to have to resort to recursive joins?
This is in PostgreSQL, by the way, but of course standard SQL code is always preferable when possible.
See this article in my blog for performance details:
PostgreSQL
: selecting items that belong to all categoriesThe solution below:
Works on any number of categories
Is more efficient that COUNT
and GROUP BY
, since it checks existence of any project / category pair exactly once, without counting.
SELECT *
FROM (
SELECT DISTINCT Project
FROM mytable
) mo
WHERE NOT EXISTS
(
SELECT NULL
FROM (
SELECT 'Apple' AS Category
UNION ALL
SELECT 'Banana'
UNION ALL
SELECT 'Carrot'
) list
WHERE NOT EXISTS
(
SELECT NULL
FROM mytable mii
WHERE mii.Project = mo.Project
AND mii.Category = list.Category
)
)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With