Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Order by clause execution in SQL

This question isn't about order of executions. It's about just the ORDER BY.

In standard execution is:

  • FROM
  • WHERE
  • GROUP BY
  • HAVING
  • SELECT
  • ORDER BY
  • TOP

EDIT: This question has been more or less the issue of "Does SQL Server apply short circuit evaluation when executing ORDER BY expressions?" The answer is SOMETIMES! I just haven't found a reasonable reason as to why. See Edit #4.

Now suppose I have a statement like this:

DECLARE @dt18YearsAgo AS DATETIME = DATEADD(YEAR,-18,GETDATE());
SELECT
  Customers.Name
FROM
  Customers
WHERE
  Customers.DateOfBirth > @dt18YearsAgo
ORDER BY
  Contacts.LastName ASC, --STATEMENT1
  Contacts.FirstName ASC, --STATEMENT2
  (
   SELECT
     MAX(PurchaseDateTime)
   FROM
     Purchases
   WHERE
     Purchases.CustomerID = Customers.CustomerID
  ) DESC --STATEMENT3

This isn't the real statement I'm trying to execute, but just an example. There are three ORDER BY statements. The third statement is only used for rare cases where the last name and first name match.

If there are no duplicate last names, does SQL Server not execute ORDER BY statements #2 and #3? And, logically, if there are no duplicate last name and first name, does SQL Server note execute statement #3.

This is really for optimization. Reading from the Purchases table should only be a last resort. In the case of my application, it wouldn't be efficient to read every single "PurchaseDateTime" from "Purchases" grouping by "CustomerID".

Please keep the answer related to my question and not a suggestion like building an index for CustomerID, PurchaseDateTime in Purchases. The real question is, does SQL Server skip unnecessary ORDER BY statements?

Edit: Apparently, SQL Server will always execute every statement as long as there is one row. Even with one row, this will give you a divide by zero error:

DECLARE @dt18YearsAgo AS DATETIME = DATEADD(YEAR,-18,GETDATE());
SELECT
  Customers.Name
FROM
  Customers
WHERE
  Customers.DateOfBirth > @dt18YearsAgo
ORDER BY
  Contacts.LastName ASC, --STATEMENT1
  Contacts.FirstName ASC, --STATEMENT2
  1/(Contacts.ContactID - Contacts.ContactID) --STATEMENT3

Edit2: Apparently, this doesn't give divide by zero:

DECLARE @dt18YearsAgo AS DATETIME = DATEADD(YEAR,-18,GETDATE());
SELECT
  Customers.Name
FROM
  Customers
WHERE
  Customers.DateOfBirth > @dt18YearsAgo
ORDER BY
  Contacts.LastName ASC, --STATEMENT1
  Contacts.FirstName ASC, --STATEMENT2
  CASE WHEN 1=0
    THEN Contacts.ContactID
    ELSE 1/(Contacts.ContactID - Contacts.ContactID)
  END --STATEMENT3

Well, the original answer to my question is YES, it does execute, but what's nice is that I can stop execute with a proper CASE WHEN

Edit 3: We can stop execution of an ORDER BY statement with a proper CASE WHEN. The trick, I guess, is to figure out how to use it properly. CASE WHEN will give me what I want, which a short circuit execution in an ORDER BY statement. I compared the Execution Plan in SSMS and depending on the CASE WHEN statement, the Purchases table isn't scanned at all EVEN THOUGH it's a clearly visible SELECT/FROM statement:

DECLARE @dt18YearsAgo AS DATETIME = DATEADD(YEAR,-18,GETDATE());
SELECT
  Customers.Name
FROM
  Customers
WHERE
  Customers.DateOfBirth > @dt18YearsAgo
ORDER BY
  Contacts.LastName ASC, --STATEMENT1
  Contacts.FirstName ASC, --STATEMENT2
  CASE WHEN 1=0
    THEN
    (
     SELECT
       MAX(PurchaseDateTime)
     FROM
       Purchases
     WHERE
       Purchases.CustomerID = Customers.CustomerID
    )
    ELSE Customers.DateOfBirth
  END DESC

Edit 4: Now I'm completely confused. Here's an example by @Lieven

WITH Test (name, ID) AS
(SELECT 'Lieven1', 1 UNION ALL SELECT 'Lieven2', 2)

SELECT * FROM Test ORDER BY name, 1/ (ID - ID)

This yields no divide by zero, which means SQL Server does in fact, do short circuit evaluation on SOME tables, specifically those created with the WITH command.

Trying this with a TABLE variable:

DECLARE @Test TABLE
(
    NAME nvarchar(30),
    ID int
);
INSERT INTO @Test (Name,ID) VALUES('Lieven1',1);
INSERT INTO @Test (Name,ID) VALUES('Lieven2',2);
SELECT * FROM @Test ORDER BY name, 1/ (ID - ID)

will yield a divide by zero error.

like image 999
ShortFuse Avatar asked Jul 13 '12 17:07

ShortFuse


1 Answers

First of all what you are calling "Statements" are no such thing. They are sub-clauses of the ORDER BY (major) clause. The difference is important, because "Statement" implies something separable, ordered and procedural, and SQL sub-clauses are none of those things.

Specifically, SQL sub-clauses (that is, the individual items of a SQL major clause (SELECT, FROM, WHERE, ORDER BY, etc.)) have no implicit (nor explicit) execution order of their own. SQL will re-order them in anyway that it finds convenient and will almost always execute all of them if it execute any of them. In short, SQL Server does not do that kind of "short-circuit" optimizations because they are trivially effective and seriously get in the way of the very different kind of optimizations that it does do (i.e., Statistical Data Access/Operator Optimizations).

So the correct answer to your original question (which you should not have changed) is NO, not reliably. You cannot rely on SQL Server to not use some sub-clause of the ORDER BY, simply because it looks like it does not need to.

The only common exception to this is that the CASE function can (in most circumstances) be used to short-circuit execution paths (within the CASE function though, not outside of it), but only because it is specifically designed for this. I cannot think of anything else in SQL that you can rely on to act like this.

like image 194
RBarryYoung Avatar answered Oct 13 '22 21:10

RBarryYoung