Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Calculating Points for Category and Sub-Category Combined

Tags:

mysql

sum

I'm currently using a points system that uses the query below to list who has the most points in a category or any category directly below it (using 'links_to' and 'links_from' in the 'category_relations table').

SELECT  u.name, 
        pa.user_id, 
        SUM(IF(pa.plus = '1', pa.points_amount, 0)) - 
        SUM(IF(pa.plus = '0', pa.points_amount, 0)) AS points
FROM    points_awarded pa, Users u
WHERE   u.user_id = pa.user_id AND 
        (
           category_id = '" . $category_id . "' OR 
           category_id IN
               (
                   SELECT links_to
                   FROM category_relations
                   WHERE links_from = '" . $category . "'
               )
        ) 
GROUP BY user_id 
ORDER BY points DESC 
LIMIT 50

What I'm now looking to do is switch this query to look for all categories for a specific user instead of looking for all users for a specific category. Below is the query i've tried to use, but this only includes points for the category and not sub categories directly below.

SELECT   u.name, 
         pa.user_id, 
         pa.category_id,
         SUM(IF(pa.plus = '1', pa.points_amount, 0)) - 
         SUM(IF(pa.plus = '0',pa.points_amount, 0)) AS points
FROM     points_awarded pa, Users u
WHERE    u.user_id = pa.user_id AND 
         u.user_id = '" . $user_id . "'
GROUP BY pa.category_id 
ORDER BY points DESC 
LIMIT 50
like image 291
john Avatar asked Jul 29 '12 14:07

john


2 Answers

Solution

From your example queries, I was able to derive a simplification of your table structure which provided me with just enough information to answer your question. You can use this solution:

SELECT   a.user_id, 
         a.name, 
         b.category_id,
         SUM(IF(d.plus = '1', d.points_amount, 0)) -
         SUM(IF(d.plus = '0', d.points_amount, 0)) AS points
FROM     Users a
JOIN     points_awarded b ON a.user_id = b.user_id
JOIN     (
         SELECT links_from, links_to 
         FROM   category_relations 
         UNION
         SELECT links_from, links_from
         FROM   category_relations 
         ) c ON b.category_id = c.links_from
JOIN     points_awarded d ON c.links_to = d.category_id
WHERE    a.user_id = $user_id
GROUP BY a.user_id,
         a.name,
         b.category_id
ORDER BY points DESC
LIMIT    50

Where $user_id is the user id parameter to the query.


Breakdown

Basically the key component of this query is the way we select the category_relations table.

Since you want the sum of subcategory points (by each parent category) plus their respective parent category's points, we can tack on the parent category vertically via UNION and essentially make it a subcategory of itself. This will then factor nicely into the GROUP BY and SUM.

Say we have some (simplified) data like this:

Users
------------------
user_id  |  name
433      |  Zane

points_awarded
------------------
user_id  |  category_id  |  plus  |  points_amount
433      |  1            |  1     |  785
433      |  2            |  1     |  871
433      |  3            |  1     |  236
433      |  4            |  0     |  64
433      |  5            |  0     |  12
433      |  7            |  1     |  897
433      |  8            |  1     |  3
433      |  9            |  0     |  48
433      |  10           |  1     |  124
433      |  14           |  0     |  676

category_relations
------------------
links_from | links_to
1          | 2
1          | 3
1          | 4
5          | 8
5          | 9
7          | 10
7          | 14

In order to differentiate between parent and sub categories, the query does a self-cross-reference join on the points_awarded table using the two fields in the category_relations table.

If we just joined on the category_relations table as is without the UNION, our resulting join would look like something like (columns simplified):

points_awarded ⋈ (links_from) category_relations ⋈ (links_to) points_awarded:

category_id  |  points_amount  |  category_id  |  points_amount
1            |  785            |  2            |  871
1            |  785            |  3            |  236
1            |  785            |  4            |  64
5            |  12             |  8            |  3
5            |  12             |  9            |  48
7            |  897            |  10           |  124
7            |  897            |  14           |  676

As you can see, the left-most category_id is the parent categories, and the right-most category_id is the respective subcategories. We can easily group on the first category_id and SUM the second points_amount...

But wait, we need the parent categorys' points_amount to be included too! How do we get the first points_amount distinctly into the second points_amount? This is where the UNION comes into play.

Before we perform our self-cross-reference join, we subselect the category_relations table so as to alter it slightly:

SELECT links_from, links_to FROM category_relations
UNION
SELECT links_from, links_from FROM category_relations

Which then results in:

category_relations (subselected)
------------------
links_from | links_to
1          | 1       <-- parent category
1          | 2
1          | 3
1          | 4
5          | 5       <-- parent category
5          | 8
5          | 9
7          | 7       <-- parent category
7          | 10
7          | 14

Where we've essentially made each parent category a subcategory of itself. We use the result of this in our join which should then produce:

category_id  |  points_amount  |  category_id  |  points_amount
1            |  785            |  1            |  785
1            |  785            |  2            |  871
1            |  785            |  3            |  236
1            |  785            |  4            |  64
5            |  12             |  5            |  12             
5            |  12             |  8            |  3
5            |  12             |  9            |  48
7            |  897            |  7            |  897            
7            |  897            |  10           |  124
7            |  897            |  14           |  676

There, NOW the parent category is factored into the SUM and we can now group on the first category_id, summing the second points_amount (ignore the first points_amount at this point as it is irrelevant), giving you the desired output.

like image 100
Zane Bien Avatar answered Oct 24 '22 08:10

Zane Bien


I'm pretty sure there's an easier way to do this with joining to the category_relations table but its difficult not seeing any visual examples of the tables... I think this should work:

SELECT   u.name
        ,pa.user_id
        ,pa.category_id,
        SUM(IF(pa.plus = '1', pa.points_amount, 0)) - SUM(IF(pa.plus = '0',pa.points_amount, 0)) AS points
FROM        points_awarded pa
JOIN        Users u
USING       (user_id)
WHERE       category_id IN(
    SELECT      links_to
    FROM        category_relations
    WHERE       links_from = (
        SELECT      category_id
        FROM        points_awarded
        WHERE       user_id = '" . $user_id . "'
    )
)       
GROUP BY    pa.category_id
ORDER BY    points DESC
LIMIT       50

You weren't seeing subcategories in your posted example above because you weren't including records from the category_relations table. Instead of finding linked categories by the queried category_id, we're using the sub-sub-query to find all category_ids associated with the user. Give this a shot, and if this isnt right, please post some small snippets showing the table data.

like image 2
WebChemist Avatar answered Oct 24 '22 07:10

WebChemist