Why select Top clause could lead to long time cost

The following query takes forever to finish, but if I remove the top 10 clause, it finishes rather quickly. big_table_1 and big_table_2 are 2 tables with 10^5 records.

I used to believe that the top clause would reduce the time cost, but apparently it doesn't here. Why?

select top 10 ServiceRequestID
from 
(
    (select * 
     from  big_table_1
     where big_table_1.StatusId=2
    ) cap1
    inner join
      big_table_2 cap2
    on cap1.ServiceRequestID = cap2.CustomerReferenceNumber
    )

asked Mar 08 '12 11:03

smwikipedia

3 Answers

There are other Stack Overflow discussions on this same topic (links at bottom). As noted in the comments above, it might have something to do with indexes: the optimizer may be getting confused and using the wrong one.

My first thought is that you are doing a select top ServiceRequestID from (select * ...), and the optimizer may have difficulty pushing the outer query down into the inner queries and making use of the index.

Consider rewriting it as

select top 10 cap1.ServiceRequestID
from big_table_1 cap1
inner join big_table_2 cap2
on cap1.ServiceRequestID = cap2.CustomerReferenceNumber
and cap1.StatusId = 2

In your query, the database is probably joining the full result sets first and only THEN limiting the output to the top 10 in the outer query. In the rewritten query, the database only has to gather the first 10 rows as the join produces them, which can save loads of time. And if ServiceRequestID is indexed, the optimizer is more likely to use that index. In your version, the outer query looks for the ServiceRequestID column in a derived result set that is virtual and unindexed.

Hope that makes sense. While hypothetically the optimizer is supposed to take whatever format we put SQL in and figure out the best way to return values every time, the truth is that the way we put our SQL together can really impact the order in which certain steps are done on the DB.
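One way to check whether the TOP clause really changes the work being done is to run both variants with SQL Server's I/O and timing statistics switched on and compare the output in the Messages tab. This is only a sketch: it reuses the table and column names from the question and assumes the flattened join form shown above. The StatusId filter is written in a WHERE clause here, which is equivalent for an inner join.

```sql
-- Sketch only: table and column names are taken from the question.
-- Report logical reads and CPU/elapsed time for each statement.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Variant A: with TOP
SELECT TOP 10 cap1.ServiceRequestID
FROM big_table_1 AS cap1
INNER JOIN big_table_2 AS cap2
    ON cap1.ServiceRequestID = cap2.CustomerReferenceNumber
WHERE cap1.StatusId = 2;

-- Variant B: same query without TOP
SELECT cap1.ServiceRequestID
FROM big_table_1 AS cap1
INNER JOIN big_table_2 AS cap2
    ON cap1.ServiceRequestID = cap2.CustomerReferenceNumber
WHERE cap1.StatusId = 2;

-- Switch the reporting off again for this session.
SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

If the two variants show very different read counts, comparing their actual execution plans should show which join strategy and index the optimizer picked in each case.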

SELECT TOP is slow, regardless of ORDER BY

Why is doing a top(1) on an indexed column in SQL Server slow?

answered Sep 21 '22 17:09

user158017

I had a similar problem with a query like yours. The ordered query without the top clause took 1 second; the same query with top 3 took 1 minute.

I found that when I used a variable for the top value, it worked as expected.

The code for your case:

declare @top int = 10;

select top (@top) ServiceRequestID
from 
(
    (select * 
     from  big_table_1
     where big_table_1.StatusId=2
    ) cap1
    inner join
      big_table_2 cap2
    on cap1.ServiceRequestID = cap2.CustomerReferenceNumber
    )

answered Sep 18 '22 17:09

Javier Suero Santos

I can't explain why, but I can give an idea:

Try adding SET ROWCOUNT 10 before your query. It helped me in some cases. Bear in mind that this setting stays in effect for the session, so you have to set it back (SET ROWCOUNT 0 turns it off) after running your query.

Explanation: SET ROWCOUNT: Causes SQL Server to stop processing the query after the specified number of rows are returned.
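A minimal sketch of that idea, using the table and column names from the question. It assumes the query is run without a top clause so that SET ROWCOUNT does the limiting; whether this is actually faster in your case is something you would have to measure.

```sql
-- Sketch only: limit the session to 10 rows instead of using TOP.
SET ROWCOUNT 10;

SELECT cap1.ServiceRequestID
FROM big_table_1 AS cap1
INNER JOIN big_table_2 AS cap2
    ON cap1.ServiceRequestID = cap2.CustomerReferenceNumber
WHERE cap1.StatusId = 2;

-- Reset the limit; 0 means "no limit", so later statements
-- in this session return all of their rows again.
SET ROWCOUNT 0;
```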

answered Sep 18 '22 17:09

Diego

Tags: sql, sql-server, sql-server-2008-r2