The MSDN documentation says that when we write
SELECT TOP(N) ..... ORDER BY [COLUMN]
we get the top N rows sorted by that column (ASC or DESC, depending on what we choose).
But if we don't specify any ORDER BY, MSDN says the rows are random, as Gail Erickson pointed out here. As noted there, the wording should be "unspecified" rather than "random". But Thomas Lee points out there that:
"When TOP is used in conjunction with the ORDER BY clause, the result set is limited to the first N number of ordered rows; otherwise, it returns the first N number of rows ramdom"
So I tested this on tables that don't have any indexes. First, I ran this query to find such tables:
SELECT *
FROM sys.objects so
WHERE so.object_id NOT IN (SELECT si.object_id
                           FROM sys.index_columns si)
  AND so.type_desc = N'USER_TABLE'
Then I ran the query below against one of those tables (in fact, I tried it against every table returned by the query above), and I always got the same rows.
SELECT TOP (2) *
FROM MstConfigSettings
This always returned the same 2 rows, and the same is true for all the other tables returned by the first query. The execution plan shows 3 steps.
There is no index lookup, just a plain table scan. The Top operator shows the actual number of rows as 2, and so does the Table Scan, even though the table contains many more rows.
But when I run something like
SELECT TOP (2) *
FROM MstConfigSettings
ORDER BY DefaultItemId
the execution plan shows a different set of steps.
So, when I don't apply ORDER BY, the steps are different (there is no Sort). But the question is: how does TOP work when there is no Sort, and why and how does it always give the same result?
If you don't specify an ORDER BY, then there is NO ORDER defined. The results can be returned in an arbitrary order, and that order might change over time, too.
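As a minimal sketch of the difference, using the table and column from the question: the first query leaves the choice of rows up to SQL Server, while the second pins it down. Note the assumption that DefaultItemId is unique in MstConfigSettings. If it is not, rows that tie on the sort key can still come back in either order.

```sql
-- No ORDER BY: SQL Server is free to return any two rows.
SELECT TOP (2) *
FROM MstConfigSettings;

-- ORDER BY on a unique key: the two rows returned are well defined
-- as long as the data does not change.
-- Assumption: DefaultItemId is unique in MstConfigSettings.
SELECT TOP (2) *
FROM MstConfigSettings
ORDER BY DefaultItemId;
```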
The SELECT TOP clause allows you to limit the number of rows or percentage of rows returned in a query result set. Because the order of rows stored in a table is unspecified, the SELECT TOP statement should be used in conjunction with the ORDER BY clause whenever you need a predictable result.
The column name that you specify in the ORDER BY clause does not need to be in the SELECT list.
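For instance, the following sketch sorts the sys.objects catalog view by create_date while returning only name. The choice of sys.objects and of these two columns is just for illustration; any table works the same way.

```sql
-- create_date drives the ordering but is not part of the SELECT list.
SELECT TOP (2) name
FROM sys.objects
ORDER BY create_date DESC;
```

One caveat: if the query uses SELECT DISTINCT, the ORDER BY columns do have to appear in the SELECT list.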
There is no guarantee which two rows you get. It will just be the first two retrieved from the table scan.
The TOP iterator in the execution plan will stop requesting rows once two have been returned.
For a scan of a heap, this will likely be the first two rows in allocation order, but this is not guaranteed. For example, SQL Server might use the advanced scanning feature, which means that your scan may begin with pages recently read by another concurrent scan.
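If you want to check the row counts per operator without the graphical plan, one option is SET STATISTICS PROFILE, which returns an extra result set with one row per plan operator. This is only a sketch against the table from the question. The exact operators shown depend on your table and server version.

```sql
-- Return per-operator runtime information after each statement.
SET STATISTICS PROFILE ON;

SELECT TOP (2) *
FROM MstConfigSettings;

SET STATISTICS PROFILE OFF;
```

In the extra result set, compare the Rows column for the Top and Table Scan operators. If the scan stopped early, as described above, both should report 2 rather than the total number of rows in the table.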