(The following is a highly simplified description of my problem. The company policy does not allow me to describe the actual scenario in any detail.)
The DB tables involved are:
PRODUCTS:
ID Name
---------
1 Ferrari
2 Lamborghini
3 Volvo
CATEGORIES:
ID Name
----------
10 Sports cars
20 Safe cars
30 Red cars
PRODUCTS_CATEGORIES
ProductID CategoryID
-----------------------
1 10
1 30
2 10
3 20
LOCATIONS:
ID Name
------------
100 Sports car store
200 Safe car store
300 Red car store
400 All cars r us
LOCATIONS_CATEGORIES:
LocationID CategoryID
------------------------
100 10
200 20
300 30
400 10
400 20
400 30
Note that the locations are not directly connected to the products, only the categories. The customer should be able to see a list of locations that can provide all the product categories that the products they want to buy belong to. So, for example:
A customer wants to buy a Ferrari. This would be available from stores in categories 10 or 30. This gives us stores 100, 300 and 400 but not 200.
However, if a customer wants to buy a Volvo and a Lamborghini this would be available from stores in categories 10 and 20. Which only gives us store 400.
Another customer wants to buy a Ferrari and a Volvo. This they could get from a store in either categories 10 + 20 (sporty and safe) or categories 30 + 20 (red and safe).
What I need is a postgres query that takes a number of products and returns the locations where all of them can be found. I got started with arrays and the <@ operator but got lost quickly. Here follows some example SQL that attempts to get stores where a Ferrari and a Lamborghini can be bought. It does not work correctly since it requires the locations to satisfy all the categories that all the selected cars belong to. It returns location 400 only but should return locations 400 and 100.
SELECT l.* FROM locations l
WHERE
(SELECT array_agg(DISTINCT(categoryid)) FROM products_categories WHERE productid IN (1,2))
<@
(SELECT array_agg(categoryid) FROM locations_categories WHERE locationid = l.id);
I hope my description makes sense.
Here is the query. You should insert a list of selected cars Ids pc.ProductId in (1,3)
and in the end you should correct condition to selected cars count so if you select 1 and 3 you should write HAVING COUNT(DISTINCT pc.ProductId) = 2
if you select 3 cars then there have to be 3. This condition in HAVING
give you condition that ALL cars are in these Locations:
SELECT Id FROM Locations l
JOIN Locations_Categories lc on l.Id=lc.LocationId
JOIN Products_Categories pc on lc.CategoryId=pc.CategoryID
where pc.ProductId in (1,3)
GROUP BY l.id
HAVING COUNT(DISTINCT pc.ProductId) = 2
Sqlfiddle demo
For example for one car it will be:
SELECT Id FROM Locations l
JOIN Locations_Categories lc on l.Id=lc.LocationId
JOIN Products_Categories pc on lc.CategoryId=pc.CategoryID
where pc.ProductId in (1)
GROUP BY l.id
HAVING COUNT(DISTINCT pc.ProductId) = 1
Only Ferrary demo Volvo and a Lamborghini demo
(This basically elaborates on @valex's answer, though I didn't realise that until I posted; please accept @valex's not this one).
This can be done using only joins and aggregation.
Build a join tree mapping locations to products, as normal. Then join it with the list of desired products (one-column values rows) and filter the join to only matching product names. You now have one row with the location of a product wherever that product can be found.
Now group by location and return locations where the number of products present is equal to the number we're looking for (for ALL). For ANY we omit the HAVING
filter because any location row returned by the join is what we want.
So:
WITH wantedproducts(productname) AS (VALUES('Volvo'), ('Lamborghini'))
SELECT l."ID"
FROM locations l
INNER JOIN locations_categories lc ON (l."ID" = lc."LocationID")
INNER JOIN categories c ON (c."ID" = lc."CategoryID")
INNER JOIN products_categories pc ON (pc."CategoryID" = c."ID")
INNER JOIN products p ON (p."ID" = pc."ProductID")
INNER JOIN wantedproducts wp ON (wp.productname = p."Name")
GROUP BY l."ID"
HAVING count(DISTINCT p."ID") = (SELECT count(*) FROM wantedproducts);
is what you want, basically.
For "stores with any of the wanted products" queries, drop the HAVING
clause.
You an also ORDER BY
the aggregate if you want to show stores with any match but sort based on number of matches.
You can also add a string_agg(p."Name")
to the SELECT
values-list if you want to list products that can be found at that store.
If you want your input to be an array rather than a values-list, just replace the VALUES (...)
with SELECT unnest($1)
and pass your array as the parameter $1
, or write it literally in place of $1
.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With