Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How can a query with a TOP 10 take infinitely longer to complete than the same query without the TOP 10?

I have a very simple situation:

I have a Table Valued Function called FullTextPagina defined as follows:

select * from Pagina as p where contains(p.PageText, @term)

And then I have 2 queries:

declare @term nvarchar(4000)= N'"DIEGO NUNES J COMBINADO"'

SELECT Id, DtPagina
FROM FullTextPagina(@term)
ORDER BY DtPagina DESC

SELECT TOP 10 Id, DtPagina
FROM FullTextPagina(@term)
ORDER BY DtPagina DESC

They are identical except for the fact that the second includes a TOP 10 statement. And they don't return anything. 0 rows.

The first executes instantly. The seconds takes 1:20m to complete.

Why?

PS:

  1. I have correctly set up the full-text index
  2. I have a non-clustered, non-unique descending index on DtPagina
  3. The execution plan is here: http://i.imgur.com/77vJB.png

EDIT

Responding to @MartinSmith, curiously, the "Number of Executions" for the Table Valued Functions is 1.18 million for the TOP 10 case and 1 for the other case

EDIT 2

Execution plan XML http://tecnologia.novaprolink.com.br/Execution%20plan.xml

EDIT 3

Adding option(recompile) or taking off the parameters don't affect the result

SELECT Id, DtPagina
FROM FullTextPagina(N'"DIEGO NUNES J COMBINADO"')
ORDER BY DtPagina DESC

SELECT TOP 10 Id, DtPagina
FROM FullTextPagina(N'"DIEGO NUNES J COMBINADO"')
ORDER BY DtPagina DESC
OPTION (RECOMPILE)

EDIT 4

Full code for FullTextPagina

USE [RexConsumo_2011_11]
GO

/****** Object:  UserDefinedFunction [dbo].[FullTextPagina]    Script Date: 11/24/2011 11:43:09 ******/
SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO

CREATE FUNCTION [dbo].[FullTextPagina] (@term nvarchar(4000))
RETURNS TABLE 
AS
RETURN 
(
    select * from Pagina as p where contains(p.PageText, @term)
)

GO
like image 671
André Pena Avatar asked Nov 24 '11 13:11

André Pena


People also ask

Why does the same query takes different amount of time to run?

The first time you run your query, data is read from storage. The next time you run that query, a lot of the data and indexes will be cached in memory.

Which query will take more time for execution?

There are a number of things that may cause a query to take longer time to execute: Inefficient query – Use non-indexed columns while lookup or joining, thus MySQL takes longer time to match the condition. Table lock – The table is locked, by global lock or explicit table lock when the query is trying to access it.


1 Answers

The problem you are getting is because SQL Server isn't accurately estimating how many rows will match the predicate.

Your query is doing SELECT TOP 10 Id, DtPagina ... ORDER BY DtPagina DESC. There are a couple of options as to how it could do this

Option 1

It could just scan the DtPagina DESC index in order and see if each row matches the full text predicate then exit when the first 10 in index order have been found.

Option 2

  1. Evaluate the full text predicate
  2. Retrieve the DtPagina column values for all the matching rows
  3. Sort them and get the top 10.

When costing the first option the bottom plan shows that it expects to have to scan about 600 rows before getting 10 that match and being able to exit. This is a massive underestimate as in fact no rows match the predicate and it needs to do this for the whole 1,186,533 rows.

When costing the second option from the top plan it can be seen that it assumes that there are 13,846.2 matching rows that will be brought back from the full text index query and need to be joined and sorted. This is a large over estimate as the actual figure is zero.

So these incorrect estimates lead it to favour the first option incorrectly.

I'm not sure what can be done to improve the accuracy of the Full Text indexing statistics. Maybe try rewriting the query using containstable

Edit: This is a bit of a hack but may well work. What if you try

declare @term nvarchar(4000)= N'"DIEGO NUNES J COMBINADO"'
declare @num int = 10

SELECT TOP (@num) Id, DtPagina
FROM FullTextPagina(@term)
ORDER BY DtPagina DESC

Then it will assume TOP 100 which may well be sufficient to tip it into choosing the other more efficient plan.

like image 195
Martin Smith Avatar answered Sep 22 '22 15:09

Martin Smith