Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Getting no of products in all categories and parent categories contains a keyword

Tags:

sql

mysql

I am trying to fetch all the categories and their count (no of products in that category) of those products where keyword matches. The query I tried doesn't give me the correct result. Also I want the parent categories till level 1 and their count as well.

e.g. I am trying with keyword watch, then category "watches" should be there with some count. Also the parent category "accessories" with the sum of its descendant categories count.

my table structures are:

tblProducts: There are 5 categories of a product, fldCategoryId1, fldCategoryId2, fldCategoryId3, fldCategoryId4 and fldCategoryId5. fldProductStatus should be 'A'

+-----------------------------+-------------------+
| Field                       | Type              |
+-----------------------------+-------------------+
| fldUniqueId                 | bigint(20)        |
| fldCategoryId1              | bigint(20)        |
| fldCategoryId2              | bigint(20)        |
| fldCategoryId3              | bigint(20)        |
| fldCategoryId4              | bigint(20)        |
| fldCategoryId5              | bigint(20)        |
| fldProductStatus            | enum('A','P','D') |
| fldForSearch                | longtext          |
+-----------------------------+-------------------+

tblCategory:

+------------------------------+-----------------------+
| Field                        | Type                  |
+------------------------------+-----------------------+
| fldCategoryId                | bigint(20)            |
| fldCategoryName              | varchar(128)          |
| fldCategoryParent            | int(11)               |
| fldCategoryLevel             | enum('0','1','2','3') |
| fldCategoryActive            | enum('Y','N')         |
+------------------------------+-----------------------+

Search Query:

SELECT count( c.fldCategoryId ) AS cnt, c.fldCategoryLevel, c.fldCategoryParent, c.fldCategoryId, c.fldCategoryName, p.fldForSearch, c.fldCategoryParent
FROM tblCategory c, tblProducts p
WHERE (
    c.fldCategoryId = p.fldCategoryId1
    OR c.fldCategoryId = p.fldCategoryId2
    OR c.fldCategoryId = p.fldCategoryId3
    OR c.fldCategoryId = p.fldCategoryId4
    OR c.fldCategoryId = p.fldCategoryId5
)
AND p.fldProductStatus = 'A'
AND (
    MATCH ( p.fldForSearch )
    AGAINST (
        '+(watches watch)'
        IN BOOLEAN MODE
    )
)
GROUP BY c.fldCategoryId

Note: The table is in the InnoDB engine and have FULLTEXT search index on 'fldForSearch' column.

EDIT: sample data can be found in sqlfiddle

like image 334
ankitr Avatar asked Apr 13 '17 05:04

ankitr


3 Answers

I'm not sure what you mean by:

Also I want the parent categories till level 1 and their count as well.

But the following query will show you a count for each category (including those with 0 found products), and a general rollup:

SELECT 
    c.fldCategoryId, 
    c.fldCategoryLevel, 
    c.fldCategoryName, 
    COUNT( * ) AS cnt
FROM tblCategory c
    LEFT JOIN tblProducts p ON
            (c.fldCategoryId = p.fldCategoryId1
        OR  c.fldCategoryId = p.fldCategoryId2
        OR  c.fldCategoryId = p.fldCategoryId3
        OR  c.fldCategoryId = p.fldCategoryId4
        OR  c.fldCategoryId = p.fldCategoryId5)
        AND p.fldProductStatus = 'A'
        AND MATCH ( p.fldForSearch )
            AGAINST (
                '+(watches watch)'
                IN BOOLEAN MODE
            )
GROUP BY 
    c.fldCategoryId
    c.fldCategoryLevel,  
    c.fldCategoryName
WITH ROLLUP;

Notes:

  • you cannot select p.fldForSearch if you expect a count of all the products in the category. fldForSearch is on a per product basis, it defeats the grouping purpose
  • I left joined with products so it returns the categories with 0 products matching your keywords. If you don't want this to happen just remove the LEFT keyword
  • I haven't checked the MATCH condition I assume it's correct.
like image 97
Sebas Avatar answered Sep 23 '22 06:09

Sebas


Start by not splaying an array (fldCategoryId...) across columns. Instead, add a new table.

Once you have done that, the queries change, such as getting rid of OR clauses.

Hopefully, any further issues will fall into place.

like image 31
Rick James Avatar answered Sep 22 '22 06:09

Rick James


Since your category tree has a fixed height (4 levels), you can create a transitive closure table on the fly with

SELECT c1.fldCategoryId AS descendantId, c.fldCategoryId AS ancestorId
FROM tblcategory c1
LEFT JOIN tblcategory c2 ON c2.fldCategoryId = c1.fldCategoryParent
LEFT JOIN tblcategory c3 ON c3.fldCategoryId = c2.fldCategoryParent
JOIN tblcategory c ON c.fldCategoryId IN (
    c1.fldCategoryId,
    c1.fldCategoryParent,
    c2.fldCategoryParent,
    c3.fldCategoryParent
)

The result will look like

| descendantId | ancestorId |
|--------------|------------|
|            1 |          1 |
|            2 |          1 |
|            2 |          2 |
|          ... |        ... |
|            5 |          1 |
|            5 |          2 |
|            5 |          5 |
|          ... |        ... |

You can now use it in a subquery (derived table) to join it with products using descendantId and with categories using ancestorId. That means that a product from category X will be indirectly associated with all ancestors of X (as well as with X). For example: Category 5 is a child of 2 - and 2 is a child of 1. So all products from category 5 must be counted for categories 5, 2 and 1.

Final query:

SELECT c.*, coalesce(sub.cnt, 0) as cnt
FROM tblCategory c
LEFT JOIN (
    SELECT tc.ancestorId, COUNT(DISTINCT p.fldUniqueId) AS cnt
    FROM tblProducts p
    JOIN (
        SELECT c1.fldCategoryId AS descendantId, c.fldCategoryId AS ancestorId
        FROM tblcategory c1
        LEFT JOIN tblcategory c2 ON c2.fldCategoryId = c1.fldCategoryParent
        LEFT JOIN tblcategory c3 ON c3.fldCategoryId = c2.fldCategoryParent
        JOIN tblcategory c ON c.fldCategoryId IN (
            c1.fldCategoryId,
            c1.fldCategoryParent,
            c2.fldCategoryParent,
            c3.fldCategoryParent
        )
    ) tc ON tc.descendantId IN (
        p.fldCategoryId1,
        p.fldCategoryId2,
        p.fldCategoryId3,
        p.fldCategoryId4,
        p.fldCategoryId5
    )
    WHERE p.fldProductStatus = 'A'
        AND MATCH ( p.fldForSearch )
            AGAINST ( '+(watches watch)' IN BOOLEAN MODE )
    GROUP BY tc.ancestorId
) sub ON c.fldCategoryId = sub.ancestorId

Result for your sample data (without level, since it seems to be wrong anyway):

| fldCategoryId | fldCategoryName | fldCategoryParent | fldCategoryActive | cnt |
|---------------|-----------------|-------------------|-------------------|-----|
|             1 |             Men |                 0 |                 Y |   5 |
|             2 |     Accessories |                 1 |                 Y |   5 |
|             3 |       Men Watch |                 1 |                 Y |   3 |
|             5 |           Watch |                 2 |                 Y |   5 |
|             6 |           Clock |                 2 |                 Y |   3 |
|             7 |     Wrist watch |                 1 |                 Y |   2 |
|             8 |           Watch |                 2 |                 Y |   4 |
|             9 |          watch2 |                 3 |                 Y |   2 |
|            10 |        fastrack |                 8 |                 Y |   3 |
|            11 |           swish |                 8 |                 Y |   2 |
|            12 |         digital |                 5 |                 Y |   2 |
|            13 |          analog |                 5 |                 Y |   2 |
|            14 |            dual |                 5 |                 Y |   1 |

Demos:

  • sqlfiddle
  • rextester

Note that the outer (left joined) subquery is logically not necessary. But from my experience MySQL doesn't perform well without it.

There are still ways for performance optimisation. One is to store the transitive closure table in an indexed temporary table. You can also persist it in a regular table, if categories do rarely change. You can also manage it with triggers.

like image 22
Paul Spiegel Avatar answered Sep 21 '22 06:09

Paul Spiegel