Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

mysql one-to-many query with negation and/or multiple criteria

I thought a query like this would be pretty easy because of the nature of relational databases but it seems to be giving me a fit. I also searched around but found nothing that really helped. Here's the situation:

Let's say I have a simple relationship for products and product tags. This is a one-to-many relationship, so we could have the following:

productid  |  tag
========================
1          |  Car
1          |  Black
1          |  Ford
2          |  Car
2          |  Red
2          |  Ford
3          |  Car
3          |  Black
3          |  Lexus
4          |  Motorcycle
4          |  Black
5          |  Skateboard
5          |  Black
6          |  Skateboard
6          |  Green

What's the most efficient way to query for all (Ford OR Black OR Skateboard) AND NOT (Motorcycles OR Green)? Another query I'm going to need to do is something like all (Car) or (Skateboard) or (Green AND Motorcycle) or (Red AND Motorcycle).

There are about 150k records in the products table and 600k records in the tags tables, so the query is going to need to be as efficient as possible. Here's one query that I've been messing around with (example #1), but it seems to be taking about 4 seconds or so. Any help would be much appreciated.

SELECT p.productid
FROM   products p
       JOIN producttags tag1 USING (productid)
WHERE  p.active = 1
       AND tag1.tag IN ( 'Ford', 'Black', 'Skatebaord' )
       AND p.productid NOT IN (SELECT productid
                               FROM   producttags
                               WHERE  tag IN ( 'Motorcycle', 'Green' ));

 

Update

The quickest query I've found so far is something like this. It's taking 100-200ms but it seems pretty inflexible and ugly. Basically I'm grabbing all products that match Ford, Black, or Skateboard. Them I'm concatenating all of the tags for those matched products into a colon-separated string and removing all products that match on :Green: AND :Motorcycle:. Any thoughts?

SELECT p.productid,
       Concat(':', Group_concat(alltags.tag SEPARATOR ':'), ':') AS taglist
FROM   products p
       JOIN producttags tag1 USING (productid)
       JOIN producttags alltags USING (productid)
WHERE  p.active = 1
       AND tag1.tag IN ( 'Ford', 'Black', 'Skateboard' )
GROUP  BY tag1.productid
HAVING ( taglist NOT LIKE '%:Motorcycle:%'
         AND taglist NOT LIKE '%:Green:%' ); 
like image 812
Jeff Avatar asked Oct 23 '12 23:10

Jeff


1 Answers

I'd write the exclusion join with no subqueries:

SELECT p.productid
FROM   products p
INNER JOIN producttags AS t ON p.productid = t.productid
LEFT OUTER JOIN producttags AS x ON p.productid = x.productid 
       AND x.tag IN ('Motorcycle', 'Green')
WHERE  p.active = 1
       AND t.tag IN ( 'Ford', 'Black', 'Skateboard' )
       AND x.productid IS NULL;

Make sure you have an index on products over the two columns (active, productid) in that order.

You should also have an index on producttags over the two columns (productid, tag) in that order.

Another query I'm going to need to do is something like all (Car) or (Skateboard) or (Green AND Motorcycle) or (Red AND Motorcycle).

Sometimes these complex conditions are hard for the MySQL optimizer. One common workaround is to use UNION to combine simpler queries:

SELECT p.productid
FROM   products p
INNER JOIN producttags AS t1 ON p.productid = t1.productid
WHERE  p.active = 1
   AND t1.tag IN ('Car', 'Skateboard')

UNION ALL

SELECT p.productid
FROM   products p
INNER JOIN producttags AS t1 ON p.productid = t1.productid
INNER JOIN producttags AS t2 ON p.productid = t2.productid 
WHERE  p.active = 1
   AND t1.tag IN ('Motorcycle')
   AND t2.tag IN ('Green', 'Red');

PS: Your tagging table is not an Entity-Attribute-Value table.

like image 63
Bill Karwin Avatar answered Sep 26 '22 00:09

Bill Karwin