Calculating SQL Server ROW_NUMBER() OVER() for a derived table

Tags:

In some other databases (e.g. DB2, or Oracle with ROWNUM), I can omit the ORDER BY clause in a ranking function's OVER() clause. For instance:

ROW_NUMBER() OVER()

This is particularly useful when used with ordered derived tables, such as:

SELECT t.*, ROW_NUMBER() OVER()
FROM (
    SELECT ...
    ORDER BY
) t

How can this be emulated in SQL Server? I've found people using this trick, but that's wrong, as it will behave non-deterministically with respect to the order from the derived table:

-- This order here ---------------------vvvvvvvv
SELECT t.*, ROW_NUMBER() OVER(ORDER BY (SELECT 1))
FROM (
    SELECT TOP 100 PERCENT ...
    -- vvvvv ----redefines this order here
    ORDER BY
) t

A concrete example (as can be seen on SQLFiddle):

SELECT v, ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) RN
FROM (
  SELECT TOP 100 PERCENT 1 UNION ALL
  SELECT TOP 100 PERCENT 2 UNION ALL
  SELECT TOP 100 PERCENT 3 UNION ALL
  SELECT TOP 100 PERCENT 4
  -- This descending order is not maintained in the outer query
  ORDER BY 1 DESC
) t(v)

Also, I cannot reuse any expression from the derived table to reproduce the ORDER BY clause in my case, as the derived table might not be available as it may be provided by some external logic.

So how can I do it? Can I do it at all?

857

asked Sep 23 '13 14:09

Lukas Eder

1 Answers

The Row_Number() OVER (ORDER BY (SELECT 1)) trick should NOT be seen as a way to avoid changing the order of underlying data. It is only a means to avoid causing the server to perform an additional and unneeded sort (it may still perform the sort but it's going to cost the minimum amount possible when compared to sorting by a column).

All queries in SQL server ABSOLUTELY MUST have an ORDER BY clause in the outermost query for the results to be reliably ordered in a guaranteed way.

The concept of "retaining original order" does not exist in relational databases. Tables and queries must always be considered unordered until and unless an ORDER BY clause is specified in the outermost query.

You could try the same unordered query 100,000 times and always receive it with the same ordering, and thus come to believe you can rely on said ordering. But that would be a mistake, because one day, something will change and it will not have the order you expect. One example is when a database is upgraded to a new version of SQL Server--this has caused many a query to change its ordering. But it doesn't have to be that big a change. Something as little as adding or removing an index can cause differences. And more: Installing a service pack. Partitioning a table. Creating an indexed view that includes the table in question. Reaching some tipping point where a scan is chosen instead of a seek. And so on.

Do not rely on results to be ordered unless you have said "Server, ORDER BY".

114

answered Nov 03 '22 05:11

ErikE

Related questions
                            
                                MYSQL: How to make NULL or empty data default to 0 during insert
                            
                                SQL Database Best Practices - Use of Archive tables?
                            
                                Using two foreign keys as a primary key - MySQL
                            
                                Joining 2 large postgres tables using int8range not scaling well
                            
                                How to do sql joins in lambda?
                            
                                Unable to turn off SQL logging in my Rails production environment
                            
                                Why does this query return a money value?
                            
                                How to get all distinct words of a specified minimum length from multiple columns in a MySQL table?
                            
                                Capturing multiple error messages from a single statement inside TRY CATCH
                            
                                Nested Query or Joins
                            
                                How to choose my primary key?
                            
                                How to store extensible metadata in an ORM-friendly way in .NET?
                            
                                Group by week, how to get empty weeks?
                            
                                Joining against a CSV Column in SQL Server
                            
                                unexpected query success
                            
                                C# , SQL update multiple rows
                            
                                Entity Framework - stored procedure return value
                            
                                javax.naming.NameNotFoundException: Name [comp/env] is not bound in this Context. Unable to find [comp] error with java scheduler
                            
                                Profiler in SQL Server 2012 Express [closed]
                            
                                What does SQL Server do with a timed out request?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Calculating SQL Server ROW_NUMBER() OVER() for a derived table

Tags:

sql

sql-server

tsql

window-functions

ranking-functions

Lukas Eder

People also ask

1 Answers

ErikE

Recent Activity

Donate For Us