Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Select MAX or Order By Limit 1

MIN/MAX vs ORDER BY and LIMIT

To follow up on this question: I found some results very different from what Sean McSomething describes:

I have a table with about 300M rows.

Select max(foo) from bar; takes about 15 sec. to run

Select foo from bar order by foo desc limit 1; takes 3 sec. to run

Sean's statement "It looks like MIN() is the way to go - it's faster in the worst case, indistinguishable in the best case" just doesn't hold for this case...but I have no idea why. Can anyone offer an explanation?

Edit: Since I am unable to show the table's structure here: assume that bar is a table in an ndb_cluster with no relations, foo is an arbitrary data point with no index.

like image 489
Mr Griever Avatar asked Nov 11 '10 17:11

Mr Griever


People also ask

What is Max () +1 SQL?

MAX(x) - 1 simply means the max value of x in the table minus one. You can always use parenthesis and aliases ( as some_cool_name ) to make thing clearer, or to change names in the result. But the first syntax is perfectly valid.

What is the use of SELECT 1?

The statement 'select 1' from any table name means that it returns only 1. For example, If any table has 4 records then it will return 1 four times.

How do you SELECT max value?

Discussion: To find the max value of a column, use the MAX() aggregate function; it takes as its argument the name of the column for which you want to find the maximum value. If you have not specified any other columns in the SELECT clause, the maximum will be calculated for all records in the table.

Does ORDER BY column need to be in SELECT?

The column-Name that you specify in the ORDER BY clause does not need to be the SELECT list.


2 Answers

To avoid a full pass, add an INDEX on foo column.

like image 144
Svisstack Avatar answered Oct 13 '22 21:10

Svisstack


I came across this question and thought I'd add what I've found. Note that the columns are indexed. I'm running on MariaDB 10.2.14.

I have a query which looks like SELECT MAX(created) FROM tbl WHERE group=0 AND created IS NOT NULL. There's an index on (group,created) (both are ints, but created can be NULL). There are many entries with group=0, not many where created IS NULL. tbl is using the Aria storage engine.

EXPLAIN shows the index is being used and gives a row count of 46312, with extra saying "Using where; Using index"

Running the query takes around 0.692s, but the status has something interesting:

Handler_read_key: 1 Handler_read_next: 45131 Handler_read_prev: 0

This seems to suggest that the key is being fully scanned for the maximum; using MIN instead of MAX seems to give similar results. This seems to suggest that MIN/MAX actually can't make use of the optimisation to just pick the first/last entry of the index here.

However, if the query is changed to SELECT created FROM tbl WHERE group=0 AND created IS NOT NULL ORDER BY created DESC LIMIT 1, whilst the query seems to take about the same amount of time to run, and EXPLAIN shows the same info, the status shows:

Handler_read_key: 1 Handler_read_next: 0 Handler_read_prev: 0

I get similar results if the order by is changed to ASC. It seems to me that using an ORDER BY...LIMIT can skip an index scan, which could lead to faster queries if there are many rows which match the index condition, if my understanding is correct.
Note that for the above results, there's enough RAM and cache allocated for holding all indexes in cache, so, presumably, index scans are fast.

I haven't done any experiments with other conditions (different MySQL versions and storage engines) but I suppose the moral of this story is, checking status of queries via SHOW SESSION STATUS may help provide answers to these things.
At least in this case, the ORDER BY...LIMIT may be more efficient than MIN/MAX even when an index can be used.

like image 44
zinga Avatar answered Oct 13 '22 21:10

zinga