Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Using MAX() and COUNT() in the same query

Tags:

mysql

I am trying to figure out what language a user answers in the most, and return by user_id, the language_id they answer in the most and how many times they have answers.

I began by SELECTing a table/sub-table which returns these results:

Table: `sub-selected`
`user_id`    `language_id`    `answers`
  1               1               1
  2               1               1
  1               2               5
  2               2               2
  1               4               3
  1               5               1

This table returns the user_id, the language_id, and how many times that language_id has been answered by the user. I used this query to get it:

SELECT t1.user_id, t2.to_language_id, COUNT(t2.to_language_id) as answers
FROM translation_results as t1
LEFT JOIN translations as t2
ON t2.translation_id = t1.translation_id
GROUP BY t2.to_language_id, t1.user_id

The table structure is:

Table: `translations`
`translation_id`    `from_phrase_id`    `to_language_id`

Table: `translation_results`
`translation_id`    `result_id` PRI-AI    `user_id`

The translations table stores all the translations requested, and the translation_results table stores the answers to those translations and the respective user_id.

So, to sum up the table and to get the user_id, their most answered language_id, and how many times they answered in that language_id, I used:

SELECT t1.user_id, t1.to_language_id, MAX(t1.answers)
FROM (
    //The sub-table
    SELECT t1.user_id, t2.to_language_id, COUNT(t2.to_language_id) as answers
    FROM translation_results as t1
    LEFT JOIN translations as t2
    ON t2.translation_id = t1.translation_id
    GROUP BY t2.to_language_id, t1.user_id
) as t1
GROUP BY t1.user_id, t1.to_language_id

But this does not collapse the table into the desired structure and instead returns:

Table: `sub-selected`
`user_id`    `language_id`    `answers`
  1               1               1
  1               2               5
  1               4               3
  1               5               1
  2               1               1
  2               2               2

I know it is affected by the group by of two clauses, but then if I only group by user_id and do not include to_language_id in my selected columns, I can't know which respective language_id is the most answered. I have also tried sub-queries and a few joins, but I find I constantly need to use MAX(t1.answers) regardless in the selected columns and thus destroys my hopes of collasping the group by correctly. How can I collapse the query correctly instead of having group by find all the unique MAX() combinations of user_id and to_language_id?

like image 377
q.Then Avatar asked Nov 10 '22 09:11

q.Then


1 Answers

To get:

the user_id, their most answered language_id, and how many times they answered in that language_id

you can use variables:

SELECT user_id, language_id, answers
FROM (
  SELECT user_id, language_id, answers,
         @rn:= IF(@uid = user_id,
                  IF(@uid:=user_id, @rn:=@rn+1, @rn:=@rn+1),
                  IF(@uid:=user_id, @rn:=1, @rn:=1)) AS rn
  FROM (SELECT t1.user_id, t2.to_language_id AS language_id, 
               COUNT(t2.to_language_id) as answers     
        FROM translation_results as t1 
        LEFT JOIN translations as t2 
           ON t2.translation_id = t1.translation_id
        GROUP BY t2.to_language_id, t1.user_id 
       ) t
  CROSS JOIN (SELECT @rn:=0, @uid:=0) AS vars
  ORDER BY user_id, answers DESC
) s
WHERE s.rn = 1

There is a limitation however in the above query: if there are more than one language_id sharing the same maximum number of answers for a user_id, then only one will be returned.

Demo here

An alternative way, is to use you query twice as a derived table:

SELECT t1.user_id, language_id, t1.answers
FROM (SELECT t1.user_id, t2.to_language_id AS language_id, 
             COUNT(t2.to_language_id) as answers
      FROM translation_results as t1
      LEFT JOIN translations as t2
         ON t2.translation_id = t1.translation_id
      GROUP BY t2.to_language_id, t1.user_id ) t1
INNER JOIN (      
   SELECT user_id, MAX(answers) AS answers
   FROM (SELECT t1.user_id, t2.to_language_id, 
                COUNT(t2.to_language_id) as answers
         FROM translation_results as t1
         LEFT JOIN translations as t2
            ON t2.translation_id = t1.translation_id
         GROUP BY t2.to_language_id, t1.user_id 
        ) t
   GROUP BY user_id ) t2
ON t1.user_id = t2.user_id AND t1.answers = t2.answers 

This query does not have the limitation of the previous query, but is likely to be less efficient compared to the previous one.

Demo here

like image 52
Giorgos Betsos Avatar answered Nov 14 '22 22:11

Giorgos Betsos