
SQL Server Full-Text Search with CONTAINSTABLE is very slow when used in a JOIN

I am using SQL Server 2008 full-text search and I am having serious performance issues depending on how I use CONTAINS or CONTAINSTABLE.

Here is a sample. (table1 has about 5,000 records, and there is a covering index on table1 that includes all the fields in the WHERE clause. I have simplified the statements, so forgive me if there are syntax issues.)

Scenario 1:

select * from table1 as t1
where t1.field1=90
and   t1.field2='something'
and   Exists(select top 1 * from containstable(table1,*, 'something') as t2 
where t2.[key]=t1.id)

Result: 10 seconds (very slow)

Scenario 2:

select * from table1 as t1
join containstable(table1,*, 'something') as t2 on t2.[key] = t1.id
where t1.field1=90
and   t1.field2='something'

Result: 10 seconds (very slow)

Scenario 3:

Declare @tbl Table(id uniqueidentifier primary key)
insert into @tbl select [key] from containstable(table1,*, 'something')

select * from table1 as t1
where t1.field1=90
and   t1.field2='something'
and   Exists(select tbl.id from @tbl as tbl where tbl.id=t1.id)

Result: a fraction of a second (super fast)

Bottom line: if I use CONTAINSTABLE in any kind of join or WHERE clause condition of a SELECT statement that also has other conditions, performance is really bad. In addition, Profiler shows the number of reads from the database going through the roof. But if I run the full-text search first, put the results in a table variable, and use that variable, everything is very fast and the number of reads is much lower. In the "bad" scenarios it seems to get stuck in a loop that makes it read from the database many times, but I don't understand why.

So the first question is: why is this happening? The second question is: how scalable are table variables? If the search returns tens of thousands of records, is it still going to be fast?

Any ideas? Thanks

Bob asked May 01 '10 17:05




1 Answer

I spent quite some time on this issue, and after running many scenarios, this is what I figured out:

If you have CONTAINS or CONTAINSTABLE anywhere in your query, that is the part that gets executed first and fairly independently. That means that even if the rest of the conditions limit your search to a single record, neither CONTAINS nor CONTAINSTABLE takes that into account. So it behaves like a parallel execution.

Since the full-text search only returns a key field, it then looks for that key as the first column of the other indexes chosen for the query. For the example above, it would want an index on [key], field1, field2. The problem is that the index for the rest of the query is chosen based on the fields in the WHERE clause, so for the example above it picks my covering index, which is something like field1, field2, Id. (The table's Id is the same as the [key] returned from the full-text search.) In summary:

  1. It executes CONTAINSTABLE.
  2. It executes the rest of the query and picks an index based on the WHERE clause of the query.
  3. It tries to merge the two. If the index picked for the rest of the query starts with the [key] field, everything is fine. If the index does not have the [key] field as its first column, it starts looping. It is not even doing a table scan; otherwise going through 5,000 records would not be this slow. The loop appears to run once for every pairing of a full-text result with a result from the rest of the query: if FTS returns 2,000 records and the rest of the query returns 3,000, it loops 2,000 * 3,000 = 6,000,000 times. I do not understand why.
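
If you want to check this behavior on your own data, one option is to compare logical reads for the slow join with and without a join hint. This is only a sketch that reuses the simplified names from the question (table1, field1, field2, id). The OPTION (HASH JOIN) variant is an assumption added for comparison, not something tried in the original investigation, and the plans you get may differ by version and data.

```sql
-- Report logical reads and elapsed time per statement (Messages tab in SSMS).
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Scenario 2 as written: the optimizer picks the join strategy.
SELECT t1.*
FROM table1 AS t1
JOIN CONTAINSTABLE(table1, *, 'something') AS t2
    ON t2.[key] = t1.id
WHERE t1.field1 = 90
  AND t1.field2 = 'something';

-- Same query, but requesting a hash join for comparison.
-- Assumption: this is a diagnostic experiment, untested against the
-- original database, and not a recommended permanent fix.
SELECT t1.*
FROM table1 AS t1
JOIN CONTAINSTABLE(table1, *, 'something') AS t2
    ON t2.[key] = t1.id
WHERE t1.field1 = 90
  AND t1.field2 = 'something'
OPTION (HASH JOIN);

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

If the second statement reports far fewer logical reads than the first, that would be consistent with the repeated-loop explanation above.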

So in my case it does the full-text search, then it does the rest of the query but picks my covering index on field1, field2, Id (which is the wrong choice here), and as a result performance falls apart. If I change my covering index to Id, field1, field2, everything is very fast.

My expectation was that FTS returns a set of [key] values, the rest of the query returns a set of [Id] values, and then each Id is matched against [key].

Of course, I simplified my query here; the actual query is much more complicated and I cannot just change the index. I also have scenarios where the text passed to the full-text search is blank, and in those scenarios I do not want to join with CONTAINSTABLE at all. In those cases, changing my covering index to have the Id field first would be a disaster.
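
A rough sketch of how the blank-text case could be kept away from CONTAINSTABLE is shown below. The @search variable and its nvarchar(200) type are hypothetical, the table and column names are the simplified ones from the question, and the real query is more complicated than this.

```sql
-- @search is a hypothetical parameter holding the user's search text.
-- It must already be a valid CONTAINS search condition (for example,
-- multi-word phrases need to be quoted by the caller).
DECLARE @search nvarchar(200) = N'something';

IF @search IS NULL OR LTRIM(RTRIM(@search)) = N''
BEGIN
    -- No search text: skip full-text search entirely, so the
    -- field1, field2, Id index can be used as before.
    SELECT t1.*
    FROM table1 AS t1
    WHERE t1.field1 = 90
      AND t1.field2 = 'something';
END
ELSE
BEGIN
    -- Search text present: materialize the full-text keys first,
    -- following the Scenario 3 pattern from the question.
    DECLARE @tbl TABLE (id uniqueidentifier PRIMARY KEY);

    INSERT INTO @tbl (id)
    SELECT [key]
    FROM CONTAINSTABLE(table1, *, @search);

    SELECT t1.*
    FROM table1 AS t1
    WHERE t1.field1 = 90
      AND t1.field2 = 'something'
      AND EXISTS (SELECT 1 FROM @tbl AS tbl WHERE tbl.id = t1.id);
END
```

Splitting the two cases into separate statements means each one gets its own plan, so the index order that suits the non-search path does not have to change.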

Anyway, for now I chose the table-variable solution from Scenario 3, since it is working for me. I am also limiting the full-text results to a few thousand rows, which helps with the potential performance issues of table variables when the number of records gets too high.
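
For anyone wanting to cap the full-text results, one way (an assumption, since the exact method is not spelled out above) is the optional top_n_by_rank argument of CONTAINSTABLE. The 2000 cap is an arbitrary illustrative number, and the OPTION (RECOMPILE) hint is an extra suggestion that may or may not help in your case.

```sql
DECLARE @tbl TABLE (id uniqueidentifier PRIMARY KEY);

-- The optional last argument (top_n_by_rank) returns only the N
-- highest-ranked matches; 2000 is an arbitrary illustrative cap.
INSERT INTO @tbl (id)
SELECT [key]
FROM CONTAINSTABLE(table1, *, 'something', 2000);

SELECT t1.*
FROM table1 AS t1
WHERE t1.field1 = 90
  AND t1.field2 = 'something'
  AND EXISTS (SELECT 1 FROM @tbl AS tbl WHERE tbl.id = t1.id)
OPTION (RECOMPILE);  -- recompiles so the plan can reflect the actual row count in @tbl
```

Keep in mind that the cap is applied before the field1 and field2 filters, so rows that match all conditions but rank below the cutoff will be dropped from the final result.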

thanks

Bob answered Sep 21 '22 06:09