
Are ORDER BY and ROW_NUMBER() deterministic?

Tags:

sql

sql-server

tsql

I've used SQL in a couple of database engines on and off for several years, but I have little theoretical knowledge, so my question may seem very "noobish" to some of you. It has become important to me now, though, so I have to ask.

Imagine a table Urls with a non-unique column status. For the sake of the question, assume that we have a large number of rows and that status has the same value in every record.

And imagine we execute this query many times:

SELECT * FROM Urls ORDER BY status

1. Do we get the same row order every time? If we do, what happens when we add some new rows? Does the order change, or are the new records appended to the end of the results? And if we don't get the same order, what conditions does the order depend on?
2. Does ROW_NUMBER() OVER (ORDER BY status) return the same order as the query above, or is it based on a different ordering mechanism?

591

asked Sep 04 '13 11:09

MKB

2 Answers

It's very simple. If you want an ordering that you can rely upon, then you need to include enough columns in your ORDER BY clause such that the combination of all of those columns is unique for each row. Nothing else is guaranteed.

For a single table, you can usually get what you want by listing the columns that are "interesting" to sort by and then including the primary key column(s) afterwards. Since the PK, by itself, guarantees uniqueness, the whole combination is also guaranteed to uniquely define the ordering. For example, if the Urls table has a primary key of {Site, Page, Ordinal}, then the following would give you a dependable result:

SELECT * FROM Urls ORDER BY status, Site, Page, Ordinal
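The same idea can be tried out in a small, self-contained sketch. It uses Python's built-in sqlite3 module as a stand-in for SQL Server, which is an assumption made only so the example runs anywhere; the sample rows and the build helper are hypothetical. It loads the same rows in two different insertion orders and checks that a query with the primary key as tie-breaker, including a ROW_NUMBER() that uses the same ORDER BY, returns identical results both times.

```python
import sqlite3


def build(rows):
    """Create an in-memory Urls table and insert the rows in the given order."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE Urls ("
        " Site TEXT, Page TEXT, Ordinal INTEGER, status INTEGER,"
        " PRIMARY KEY (Site, Page, Ordinal))"
    )
    con.executemany("INSERT INTO Urls VALUES (?, ?, ?, ?)", rows)
    return con


# Hypothetical sample data: every row has the same status, as in the question.
rows = [
    ("a.example", "/home", 1, 0),
    ("a.example", "/home", 2, 0),
    ("b.example", "/docs", 1, 0),
    ("b.example", "/faq", 1, 0),
]

# The primary key columns follow status, so no two rows can tie.
QUERY = """
SELECT Site, Page, Ordinal,
       ROW_NUMBER() OVER (ORDER BY status, Site, Page, Ordinal) AS rn
FROM Urls
ORDER BY status, Site, Page, Ordinal
"""

forward = build(rows).execute(QUERY).fetchall()
backward = build(list(reversed(rows))).execute(QUERY).fetchall()
```

With ORDER BY status alone, neither the outer query nor ROW_NUMBER() would be tied to any particular order among the equal rows, so the two results would not be guaranteed to match.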

163

answered Sep 30 '22 05:09

Damien_The_Unbeliever

ORDER BY is not stable in SQL Server (nor in any other database, as far as I know). A stable sort is one that keeps rows with equal sort keys in the same relative order in which they were read, which here would mean the order in which they are found in the table.

The high-level reason is quite simple. Tables are sets. They have no order. So a "stable" sort just doesn't make sense.

The lower-level reasons are probably more important. The database could be implementing a parallel sort algorithm. Such algorithms are not, by default, stable.
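To make the point concrete outside the database, here is a minimal Python sketch. Python's sorted() is documented to be stable, so it shows the best case: even with a stable sort, rows with equal keys come out in whatever order they went in. The two "scan" lists are hypothetical read orders of the same three rows; they are illustrative only and say nothing about how SQL Server actually reads a table.

```python
# Each record is (id, status); every status is equal, as in the question.
scan_a = [(1, 0), (2, 0), (3, 0)]  # one possible read order
scan_b = [(3, 0), (1, 0), (2, 0)]  # another read order of the same rows


def by_status(row):
    return row[1]


def by_status_then_id(row):
    return (row[1], row[0])


# sorted() is stable: tied rows keep their input order,
# so the result simply mirrors the order of the scan.
sorted_a = sorted(scan_a, key=by_status)
sorted_b = sorted(scan_b, key=by_status)

# Adding the unique id as a tie-breaker removes the dependence on input order.
tie_broken_a = sorted(scan_a, key=by_status_then_id)
tie_broken_b = sorted(scan_b, key=by_status_then_id)
```

Here sorted_a and sorted_b differ even though they hold the same rows, while tie_broken_a and tie_broken_b are identical. Since a table has no defined read order, stability alone would not make the result repeatable.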

If you want a repeatable order, then include a key column in the sort so that no two rows can tie.

This is alluded to in the documentation:

To achieve stable results between query requests using OFFSET and FETCH, the following conditions must be met:

The underlying data that is used by the query must not change. That is, either the rows touched by the query are not updated or all requests for pages from the query are executed in a single transaction using either snapshot or serializable transaction isolation. For more information about these transaction isolation levels, see SET TRANSACTION ISOLATION LEVEL (Transact-SQL).

The ORDER BY clause contains a column or combination of columns that are guaranteed to be unique.
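The second condition in the quote can be illustrated with a rough paging sketch. It again uses Python's sqlite3 module as a stand-in, so LIMIT and OFFSET play the role of T-SQL's OFFSET ... ROWS FETCH NEXT ... ROWS ONLY; the id column, the page size, and the fetch_page helper are assumptions made for the example. Because the ORDER BY ends in a unique column and the data does not change between requests, each row lands on exactly one page.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Urls (id INTEGER PRIMARY KEY, status INTEGER)")
# Ten hypothetical rows, all with the same status.
con.executemany(
    "INSERT INTO Urls (id, status) VALUES (?, 0)",
    [(i,) for i in range(1, 11)],
)

PAGE_SIZE = 4


def fetch_page(page_index):
    """Return the ids on one page, ordered by status with id as tie-breaker."""
    cur = con.execute(
        "SELECT id FROM Urls ORDER BY status, id LIMIT ? OFFSET ?",
        (PAGE_SIZE, page_index * PAGE_SIZE),
    )
    return [row[0] for row in cur]


pages = [fetch_page(i) for i in range(3)]
seen = [row_id for page in pages for row_id in page]
```

If the ORDER BY were on status alone, the engine would be free to order the tied rows differently on each request, so a row could show up on two pages or on none.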

answered Sep 30 '22 07:09

Gordon Linoff
