SQL massive performance difference using SELECT TOP x even when x is much higher than selected rows

I'm selecting some rows from a table-valued function and have found an inexplicably massive performance difference when I put SELECT TOP in the query.

SELECT   col1, col2, col3  -- etc.
FROM     dbo.some_table_function()
WHERE    col1 = @parameter
--ORDER BY col1

takes upwards of 5 or 6 minutes to complete.

However

SELECT   TOP 6000 col1, col2, col3  -- etc.
FROM     dbo.some_table_function()
WHERE    col1 = @parameter
--ORDER BY col1

completes in about 4 or 5 seconds.

This wouldn't surprise me if the returned set of data were huge, but the particular query involved returns ~5000 rows out of 200,000.

So in both cases, the whole of the table is processed, as SQL Server continues to the end in search of 6000 rows which it will never get to. Why the massive difference then? Is this something to do with the way SQL Server allocates space in anticipation of the result set size (the TOP 6000 thereby giving it a low requirement which is more easily allocated in memory)? Has anyone else witnessed something like this?

Thanks

725

asked Sep 08 '09 11:09

Ray

2 Answers

Table valued functions can have a non-linear execution time.

Consider a function equivalent to this query:

SELECT  (
        SELECT  SUM(mi.value)
        FROM    mytable mi
        WHERE   mi.id <= mo.id
        )
FROM    mytable mo
ORDER BY
        mo.value

This query (which calculates a running SUM) is fast at the beginning and slow at the end, since for each row from mo it must sum all the preceding values, which requires rewinding the row source.

Time taken to calculate SUM for each row increases as the row numbers increase.

If you make mytable large enough (say, 100,000 rows, the same order of magnitude as the 200,000 in your example) and run this query, you will see that it takes considerable time.

However, if you apply TOP 5000 to this query, you will see that it completes in much less than 1/20 of the time required for the full table.
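A rough way to try this out is sketched below. The table definition, the constant value of 1, and the use of sys.all_objects as a row generator are assumptions for the demo, not part of the original example. The sketch also orders by id rather than value, so that TOP can stop after the cheapest rows. If the subquery really does re-read all preceding rows, the first 5,000 rows cost about 5000 * 5001 / 2 = 12.5 million row reads against roughly 5 billion for all 100,000 rows, which is about 1/400 of the work rather than 1/20. The optimizer may pick a different plan, so the measured timings can differ.

```sql
-- Hypothetical demo table; column names follow the example query, the data is made up.
CREATE TABLE mytable (
    id    INT NOT NULL PRIMARY KEY,
    value INT NOT NULL
);

-- Fill it with 100,000 rows, using a cross join of a system view as a row source.
INSERT INTO mytable (id, value)
SELECT TOP (100000)
       ROW_NUMBER() OVER (ORDER BY (SELECT NULL)),
       1
FROM   sys.all_objects a
CROSS JOIN sys.all_objects b;

-- Report CPU and elapsed time for each statement in the Messages tab.
SET STATISTICS TIME ON;

-- Full running sum: the correlated subquery runs once per outer row.
SELECT  mo.id,
        (
        SELECT  SUM(mi.value)
        FROM    mytable mi
        WHERE   mi.id <= mo.id
        ) AS running_sum
FROM    mytable mo
ORDER BY
        mo.id;

-- Same query limited to the first 5,000 rows, which are the cheapest ones.
SELECT  TOP (5000)
        mo.id,
        (
        SELECT  SUM(mi.value)
        FROM    mytable mi
        WHERE   mi.id <= mo.id
        ) AS running_sum
FROM    mytable mo
ORDER BY
        mo.id;

SET STATISTICS TIME OFF;
```

Comparing the two elapsed times in the Messages output should show whether the cost grows faster than the row count on your server.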

Most probably, something similar happens in your case too.

To say anything more definite, I need to see the function definition.

Update:

SQL Server can push predicates into the function.

For instance, I just created this TVF:

CREATE FUNCTION fn_test()
RETURNS TABLE
AS
RETURN  (
        SELECT  *
        FROM    master
        );

These queries:

SELECT  *
FROM    fn_test()
WHERE   name = @name

SELECT  TOP 1000 *
FROM    fn_test()
WHERE   name = @name

yield different execution plans: the first one uses a clustered index scan, while the second one uses an index seek with a TOP.
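One way to compare the two queries on your own system is to turn on I/O and timing statistics and run them back to back, as in the sketch below. The type and value given to @name are assumptions, since the column definition of the underlying table is not shown here. Enabling "Include Actual Execution Plan" in Management Studio alongside this shows which operator (scan or seek) each query ends up with.

```sql
-- Report logical reads and timings for each statement in the Messages tab.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Assumed type and value; adjust to match the real column.
DECLARE @name NVARCHAR(128) = N'some_name';

-- Query without TOP.
SELECT  *
FROM    fn_test()
WHERE   name = @name;

-- Same query with TOP.
SELECT  TOP 1000 *
FROM    fn_test()
WHERE   name = @name;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

A large gap in logical reads between the two statements would point to different plans rather than to the size of the result set.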

101

answered Oct 08 '22 06:10

Quassnoi

I had the same problem: a simple query joining five tables and returning 1000 rows took two minutes to complete. When I added "TOP 10000" to it, it completed in less than one second. It turned out that the clustered index on one of the tables was heavily fragmented.

After rebuilding the index, the query now completes in less than a second.
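A minimal sketch of how fragmentation could be checked and fixed, assuming a table named dbo.my_table (a placeholder, not a name from the case above). It reads the sys.dm_db_index_physical_stats function for that table and then rebuilds all of its indexes.

```sql
-- dbo.my_table is a hypothetical name; substitute the table you suspect.
SELECT  OBJECT_NAME(ips.object_id) AS table_name,
        i.name                     AS index_name,
        ips.avg_fragmentation_in_percent,
        ips.page_count
FROM    sys.dm_db_index_physical_stats(
            DB_ID(), OBJECT_ID(N'dbo.my_table'), NULL, NULL, 'LIMITED'
        ) AS ips
JOIN    sys.indexes AS i
        ON  i.object_id = ips.object_id
        AND i.index_id  = ips.index_id
ORDER BY
        ips.avg_fragmentation_in_percent DESC;

-- Rebuild every index on the table, including the clustered one.
ALTER INDEX ALL ON dbo.my_table REBUILD;
```

A rebuild can be expensive on a large table, so it may be worth looking at the fragmentation figures first and rebuilding only the indexes that need it.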

answered Oct 08 '22 06:10

Stefan Carlsson


Tags: performance, sql, sql-server, tsql, user-defined-functions
