Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to query data based on multiple 'tags' in SQL?

I have three simple tables:

Items

ItemID (int, PK)
ItemName (nvarchar50)
ItemCost (int)

Tags

TagID (int, PK)
TagName (nvarchar50)

ItemTags

ItemID (int, FK->Items)
TagID (int, FK->Tags)

How can I write a query to the effect of 'find all items tagged with tag1 and tag2, but not tag3'?

Do I need to use a completely different schema?

like image 663
Vok Avatar asked Aug 15 '16 12:08

Vok


2 Answers

I like to do this with GROUP BY and HAVING:

SELECT it.ItemId
FROM ItemTags it JOIN
     Tags t
     ON t.TagId = it.TagId
GROUP BY it.ItemId
HAVING SUM(CASE WHEN t.TagName = 'Tag1' THEN 1 ELSE 0 END) > 0 AND
       SUM(CASE WHEN t.TagName = 'Tag2' THEN 1 ELSE 0 END) > 0 AND
       SUM(CASE WHEN t.TagName = 'Tag3' THEN 1 ELSE 0 END) = 0;

Each condition in the HAVING clause checks for one tag. The first two use > 0 to say the tag must exist for the item. The third uses = 0 to say the tag must not exist for the item.

like image 196
Gordon Linoff Avatar answered Nov 03 '22 19:11

Gordon Linoff


You can use OUTER APPLY:

SELECT i.*
FROM Items i
OUTER APPLY (
    SELECT COUNT(*) as TagsCount
    FROM ItemTags it
    INNER JOIN Tags t 
        ON t.TagID = it.TagID
    WHERE i.ItemID = it.ItemID
        AND t.TagName IN ('tag1','tag2') 
) as tt
WHERE TagsCount = 2

At first we get all ItemID's with counted TagsID's. Then join with Items tables with filtering only those who have TagsCount = 2

EDIT#1

Add a sample:

;WITH Items AS (
    SELECT *
    FROM (VALUES
    (1,'Item1',100),(2,'Item2',50),(3,'Item3',90),(4,'Item4',63),(5,'Item5',75)
    )as t(ItemID,ItemName,ItemCost)
)
, Tags AS (
    SELECT *
    FROM (VALUES
    (1,'tag1'),(2,'tag2'),(3,'tag3'),(4,'tag4'),(5,'tag5')
    ) as t(TagID, TagName)
)
, ItemTags AS (
    SELECT *
    FROM (VALUES
    (1,1),(1,2),            --This
    (2,1),(2,2),(2,3),      --and that records we need to get
    (3,1),      (3,3),(3,4),
          (4,2),            (4,5),
    (5,1)
    ) as t(ItemID, TagID)
)

SELECT i.*
FROM Items i
CROSS APPLY (
    SELECT COUNT(*) as TagsCount
    FROM ItemTags it
    INNER JOIN Tags t 
        ON t.TagID = it.TagID
    WHERE i.ItemID = it.ItemID
        AND t.TagName IN ('tag1','tag2')
    HAVING COUNT(*) = 2 
) as tt

Output:

ItemID  ItemName    ItemCost
1       Item1       100
2       Item2       50

EDIT#2

If you want to filter items with no tag3 tag present you can add left join.

SELECT i.*
FROM Items i
CROSS APPLY (
    SELECT COUNT(*) as TagsCount
    FROM ItemTags it
    INNER JOIN Tags t 
        ON t.TagID = it.TagID
    WHERE i.ItemID = it.ItemID
        AND t.TagName IN ('tag1','tag2')
    HAVING COUNT(*) = 2 
) as tin
LEFT JOIN (
    SELECT it.Itemid
    FROM ItemTags it
    INNER JOIN Tags t 
        ON t.TagID = it.TagID 
    WHERE t.TagName IN ('tag3')
    ) tnot
    ON tnot.Itemid = i.itemid
WHERE tnot.ItemId is NULL

If yo want to filter by some tags you can use temporary table with items that has one tags, and has tags you don't need and then join them. Also dynamic SQL may be an option.

like image 34
gofr1 Avatar answered Nov 03 '22 20:11

gofr1