
Does a GROUP BY on a UNIQUE key calculate all the groups before applying the LIMIT clause?

If I GROUP BY on a unique key, and apply a LIMIT clause to the query, will all the groups be calculated before the limit is applied?

If I have a hundred records in the table (each with a unique key), will there be 100 records in the temporary table created for the GROUP BY before the LIMIT is applied?

A case study of why I need this:

Take Stack Overflow for example.

Each query that shows a list of questions also shows the user who asked each question and the number of badges that user has.

So, while user<->question is one-to-one, user<->badges is one-to-many.

The only way to do it in one query (rather than one query on questions, another on users, and then combining the results) is to group the query by the primary key (question_id) and join + GROUP_CONCAT the user_badges table.

The same goes for the questions' tags.

Code example:
Table questions:
question_id (int) (pk) | question_body (varchar)

Table tag_question:
question_id (int) | tag_id (int)



SELECT:

SELECT questions.question_id,
       questions.question_body,
       GROUP_CONCAT(tag_id,' ') AS 'tags_ids'
FROM
       questions
   JOIN
       tag_question
   ON
       questions.question_id=tag_question.question_id
GROUP BY
       questions.question_id
LIMIT 15

Itay Moav -Malimovka asked Feb 02 '26 17:02


2 Answers

Yes. Logically, the clauses of a query are evaluated in this order:

  • FROM
  • WHERE
  • GROUP BY
  • HAVING
  • SELECT
  • ORDER BY
  • LIMIT

LIMIT is applied last, so your grouping will be just fine.
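
If the aim is to keep the grouping work small, one option is to apply the LIMIT in a derived table first and then join and group only those rows. This is a minimal sketch, assuming the questions and tag_question tables from the question; the alias q and the ORDER BY column are illustrative choices, not part of the original query:

```sql
-- Pick the 15 questions first, then join and group only those rows.
SELECT q.question_id,
       q.question_body,
       GROUP_CONCAT(tq.tag_id, ' ') AS tags_ids
FROM (
       SELECT question_id, question_body
       FROM   questions
       ORDER BY question_id   -- assumption: any deterministic order would do
       LIMIT 15
     ) AS q
JOIN   tag_question tq
  ON   tq.question_id = q.question_id
GROUP BY
       q.question_id, q.question_body;
```

Because this uses an inner join, any of the 15 questions that has no tags is dropped, so the result may contain fewer than 15 rows.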

Now, looking at your rephrased question: you don't have just one row per group, but many. In the case of Stack Overflow, you'll have just one user per row, but many badges per user, i.e.

(uid, badge_id, etc.)
(1, 2, ...)
(1, 3, ...)
(1, 12, ...)

All of those rows would be grouped together.

To avoid a full table scan, all you need are indexes. That said, if you need to compute something like a SUM over the rows, you cannot avoid a full scan.
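
As a rough sketch of what such an index could look like, assuming the tag_question table from the question (the index name is hypothetical, and questions.question_id is already covered by its primary key):

```sql
-- Hypothetical index name; columns are taken from the question's schema.
-- Leading with question_id lets the join and the GROUP BY read rows in key order.
CREATE INDEX idx_tag_question_question_tag
    ON tag_question (question_id, tag_id);
```

Whether the optimizer actually uses it depends on the data and the server version, so it is worth checking the plan afterwards.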

EDIT:

You'll need something like this (look at the WHERE clause):

SELECT
  q1.question_id,
  q1.question_body,
  GROUP_CONCAT(tq.tag_id,' ') AS 'tags_ids'
FROM
  questions q1
  JOIN tag_question tq
    ON q1.question_id = tq.question_id
WHERE
  q1.question_id IN (
    SELECT
      tq2.question_id
    FROM
      tag_question tq2
      JOIN tag t
        ON tq2.tag_id = t.tag_id
    WHERE
      t.name = 'the-misterious-tag'
  )
GROUP BY
  q1.question_id
LIMIT 15

Seb answered Feb 04 '26 05:02


LIMIT does get applied after GROUP BY.

Whether a temporary table is created depends on how your indexes are built.

If you have an index on the grouping field and don't order by the aggregate results, then an INDEX SCAN FOR GROUP BY is applied, and each aggregate is computed on the fly.

That means that if a group is cut off by the LIMIT, its aggregate won't ever be calculated.

But if you order by an aggregate, then, of course, all of them need to be calculated before they can be sorted.

That's why they are calculated first and then the filesort is applied.

Update:

As for your query, see what EXPLAIN EXTENDED says for it.
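
A sketch of that check, assuming the table and column names from the question (with underscores in place of the hyphens). Plain EXPLAIN is used here because the EXTENDED keyword was removed in MySQL 8.0; older servers accept EXPLAIN EXTENDED as well:

```sql
-- Prefix the query with EXPLAIN to see the chosen plan instead of the result rows.
EXPLAIN
SELECT questions.question_id,
       questions.question_body,
       GROUP_CONCAT(tag_id, ' ') AS tags_ids
FROM   questions
JOIN   tag_question
  ON   questions.question_id = tag_question.question_id
GROUP BY
       questions.question_id
LIMIT 15;
```

In the output, the Extra column is the place to look: "Using temporary" or "Using filesort" there would suggest that the groups are materialized or sorted before the LIMIT takes effect.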

Most probably, question_id is a PRIMARY KEY for your table, and most probably, it will be used in a scan.

That means no filesort will be applied, and the join itself will never proceed past the 15th row.

To make sure, rewrite your query as follows:

SELECT question_id,
       question_body,
       (
       SELECT  GROUP_CONCAT(tag_id, ' ')
       FROM    tag_question t
       WHERE   t.question_id = q.question_id
       )
FROM   questions q
ORDER BY
       question_id
LIMIT 15
  • First, it is more readable,
  • Second, it is more efficient, and
  • Third, it will return even untagged questions (which your current query doesn't).

Quassnoi answered Feb 04 '26 07:02


