Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Unordered results in SQL

I have read it over and over again that SQL, at its heart, is an unordered model. That means executing the same SQL query multiple times can return result-set in different order, unless there's an "order by" clause included. Can someone explain why can a SQL query return result-set in different order in different instances of running the query? It may not be the case always, but its certainly possible.

Algorithmically speaking, does query plan not play any role in determining the order of result-set when there is no "order by" clause? I mean when there is query plan for some query, how does the algorithm not always return data in the same order?

Note: Am not questioning the use of order by, am asking why there is no-guarantee, as in, am trying to understand the challenges due to which there cannot be any guarantee.

like image 291
Chrysalis Avatar asked Jan 27 '14 00:01

Chrysalis


3 Answers

Some SQL Server examples where the exact same execution plan can return differently ordered results are

  1. An unordered index scan might be carried out in either allocation order or key order dependant on the isolation level in effect.
  2. The merry go round scanning feature allows scans to be shared between concurrent queries.
  3. Parallel plans are often non deterministic and order of results might depend on the degree of parallelism selected at runtime and concurrent workload on the server.
  4. If the plan has nested loops with unordered prefetch this allows the inner side of the join to proceed using data from whichever I/Os happened to complete first
like image 115
Martin Smith Avatar answered Oct 08 '22 23:10

Martin Smith


Martin Smith has some great examples, but the absolute dead simple way to demonstrate when SQL Server will change the plan used (and therefore the ordering that a query without ORDER BY will be used, based on the different plan) is to add a covering index. Take this simple example:

CREATE TABLE dbo.floob
(
  blat INT PRIMARY KEY, 
  x VARCHAR(32)
);

INSERT dbo.floob VALUES(1,'zzz'),(2,'aaa'),(3,'mmm');

This will order by the clustered PK:

SELECT x FROM dbo.floob;

Results:

x
----
zzz
aaa
mmm

Now, let's add an index that happens to cover the query above.

CREATE INDEX x ON dbo.floob(x);

The index causes a recompile of the above query when we run it again; now it orders by the new index, because that index provides a more efficient way for SQL Server to return the results to satisfy the query:

SELECT x FROM dbo.floob;

Results:

x
----
aaa
mmm
zzz

Take a look at the plans - neither has a sort operator, they are just - without any other ordering input - relying on the inherent order of the index, and they are scanning the whole index because they have to (and the cheapest way for SQL Server to scan the index is in order). (Of course even in these simple cases, some of the factors in Martin's answer could influence a different order; but this holds true in the absence of any of those factors.)

As others have stated, the ONLY WAY TO RELY ON ORDER is to SPECIFY AN ORDER BY. Please write that down somewhere. It doesn't matter how many scenarios exist where this belief can break; the fact that there is even one makes it futile to try to find some guidelines for when you can be lazy and not use an ORDER BY clause. Just use it, always, or be prepared for the data to not always come back in the same order.

Some related thoughts on this:

  • Bad habits to kick : relying on undocumented behavior
  • Why people think some SQL Server 2000 behaviors live on… 12 years later
like image 45
Aaron Bertrand Avatar answered Oct 08 '22 22:10

Aaron Bertrand


Quote from Wikipedia:

"As SQL is a declarative programming language, SELECT queries specify a result set, but do not specify how to calculate it. The database translates the query into a "query plan" which may vary between executions, database versions and database software. This functionality is called the "query optimizer" as it is responsible for finding the best possible execution plan for the query, within applicable constraints."

It all depends on what the query optimizer picks as a plan - table scan, index scan, index seek, etc.

Other factors that might influence picking a plan are table/index statistics and parameter sniffing to name a few.

In short, the order is never guaranteed without an ORDER BY clause.

like image 24
CRAFTY DBA Avatar answered Oct 08 '22 23:10

CRAFTY DBA