I am trying to figure out what language a user answers in the most, and return by user_id
, the language_id
they answer in the most and how many times they have answers.
I began by SELECT
ing a table/sub-table which returns these results:
Table: `sub-selected`
`user_id` `language_id` `answers`
1 1 1
2 1 1
1 2 5
2 2 2
1 4 3
1 5 1
This table returns the user_id
, the language_id
, and how many times that language_id
has been answered by the user. I used this query to get it:
SELECT t1.user_id, t2.to_language_id, COUNT(t2.to_language_id) as answers
FROM translation_results as t1
LEFT JOIN translations as t2
ON t2.translation_id = t1.translation_id
GROUP BY t2.to_language_id, t1.user_id
The table structure is:
Table: `translations`
`translation_id` `from_phrase_id` `to_language_id`
Table: `translation_results`
`translation_id` `result_id` PRI-AI `user_id`
The translations
table stores all the translations requested, and the translation_results table stores the answers to those translations and the respective user_id
.
So, to sum up the table and to get the user_id, their most answered language_id
, and how many times they answered in that language_id
, I used:
SELECT t1.user_id, t1.to_language_id, MAX(t1.answers)
FROM (
//The sub-table
SELECT t1.user_id, t2.to_language_id, COUNT(t2.to_language_id) as answers
FROM translation_results as t1
LEFT JOIN translations as t2
ON t2.translation_id = t1.translation_id
GROUP BY t2.to_language_id, t1.user_id
) as t1
GROUP BY t1.user_id, t1.to_language_id
But this does not collapse the table into the desired structure and instead returns:
Table: `sub-selected`
`user_id` `language_id` `answers`
1 1 1
1 2 5
1 4 3
1 5 1
2 1 1
2 2 2
I know it is affected by the group by of two clauses
, but then if I only group by user_id and do not include to_language_id in my selected columns, I can't know which respective language_id is the most answered. I have also tried sub-queries and a few joins, but I find I constantly need to use MAX(t1.answers)
regardless in the selected columns and thus destroys my hopes of collasping the group by
correctly. How can I collapse the query correctly instead of having group by
find all the unique MAX()
combinations of user_id
and to_language_id
?
To get:
the
user_id
, their most answeredlanguage_id
, and how many times they answered in that language_id
you can use variables:
SELECT user_id, language_id, answers
FROM (
SELECT user_id, language_id, answers,
@rn:= IF(@uid = user_id,
IF(@uid:=user_id, @rn:=@rn+1, @rn:=@rn+1),
IF(@uid:=user_id, @rn:=1, @rn:=1)) AS rn
FROM (SELECT t1.user_id, t2.to_language_id AS language_id,
COUNT(t2.to_language_id) as answers
FROM translation_results as t1
LEFT JOIN translations as t2
ON t2.translation_id = t1.translation_id
GROUP BY t2.to_language_id, t1.user_id
) t
CROSS JOIN (SELECT @rn:=0, @uid:=0) AS vars
ORDER BY user_id, answers DESC
) s
WHERE s.rn = 1
There is a limitation however in the above query: if there are more than one language_id
sharing the same maximum number of answers for a user_id
, then only one will be returned.
Demo here
An alternative way, is to use you query twice as a derived table:
SELECT t1.user_id, language_id, t1.answers
FROM (SELECT t1.user_id, t2.to_language_id AS language_id,
COUNT(t2.to_language_id) as answers
FROM translation_results as t1
LEFT JOIN translations as t2
ON t2.translation_id = t1.translation_id
GROUP BY t2.to_language_id, t1.user_id ) t1
INNER JOIN (
SELECT user_id, MAX(answers) AS answers
FROM (SELECT t1.user_id, t2.to_language_id,
COUNT(t2.to_language_id) as answers
FROM translation_results as t1
LEFT JOIN translations as t2
ON t2.translation_id = t1.translation_id
GROUP BY t2.to_language_id, t1.user_id
) t
GROUP BY user_id ) t2
ON t1.user_id = t2.user_id AND t1.answers = t2.answers
This query does not have the limitation of the previous query, but is likely to be less efficient compared to the previous one.
Demo here
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With