Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Combining free text search with another condition is slow

I have a free text catalog on a simple table on SQL Server 2008R2:

CREATE FULLTEXT CATALOG customer_catalog;
CREATE FULLTEXT INDEX ON customer
 ( 
   name1
 ) 
  KEY INDEX customer_pk
  ON customer_catalog;
ALTER FULLTEXT INDEX ON customer START UPDATE POPULATION;

If I perform the following three queries the first two return almost immediately, while the last one takes ~14 seconds on a table with 100,000 records:

SELECT
        customer_id
    FROM
        customer
    WHERE
        CONTAINS(customer.*, 'nomatch');

SELECT
        customer_id
    FROM
        customer
    WHERE
        customer.customer_id = 0;

SELECT
        customer_id
    FROM
        customer
    WHERE
        CONTAINS(customer.*, 'nomatch')
            OR customer.customer_id = 0;

Here are the queryplans:

enter image description here

Why is the third query so much slower? Can I do anything to improve it or do I need to split the query?

like image 988
Rasmus Faber Avatar asked May 08 '13 10:05

Rasmus Faber


3 Answers

It's hard to say why, but it appears SQL Server is choosing an inefficient query plan. Here are some suggestions:

Update the statistics on the table:

UPDATE STATISTICS dbo.customer

Once the statistics are up to date, you could try your queries again and see if there is improvement.

Something else is that for the combined OR statement, SQL Server is using an index scan, instead of a seek. You could try the FORCESEEK hint and see if that makes a difference:

SELECT customer_id
FROM customer WITH (FORCESEEK)
WHERE CONTAINS(customer.*, 'nomatch')
OR customer.customer_id = 0;

One other option, as you mentioned, is to split the statements. The following UNION performs just as well as your first two statements combined:

SELECT customer_id FROM customer
WHERE CONTAINS(customer.*, 'nomatch')

UNION

SELECT customer_id FROM customer
WHERE customer.customer_id = 0

Update - changed above query to UNION instead of UNION ALL.

As @PondLife indicated in the comments, I meant to do a UNION in the above query instead of UNION ALL. After thinking about it, I also tried with UNION ALL and it seemed to be faster. This assumes you don't care about the duplicate IDs:

SELECT customer_id FROM customer
WHERE CONTAINS(customer.*, 'nomatch')

UNION ALL

SELECT customer_id FROM customer
WHERE customer.customer_id = 0
like image 112
chue x Avatar answered Oct 10 '22 16:10

chue x


The "OR" logicial condition often makes queries run very slowly : / Often, the best option is to use UNION (ALL).

In your case, I'm quite curious about the usage you make of

SELECT
    customer_id
FROM
    customer
WHERE
    customer.customer_id = 0;

It would only result in a list (maybe empty) of zeros. Is it to count (!) how many customer have an id = 0? Is it to check if any customer has an id of 0?

If it's not to count the zeroes but to know if they are any, then this query should be efficient:

SELECT
    customer_id
FROM
    customer
WHERE
    CONTAINS(customer.*, 'nomatch')
    AND customer.customer_id <> 0
UNION ALL
SELECT TOP(1)
    0
FROM
    customer
WHERE
    customer.customer_id = 0

otherwise the efficient query is this one:

SELECT
    customer_id
FROM
    customer
WHERE
    CONTAINS(customer.*, 'nomatch')
    AND customer.customer_id <> 0
UNION ALL
SELECT
    0
FROM
    customer
WHERE
    customer.customer_id = 0

(I just removed the TOP clause)

like image 27
Serge Avatar answered Oct 10 '22 18:10

Serge


Depending on your MS SQL 2008 R2 sevice pack version, your problem might be related to following Microsoft Connect issue: Full-text performance with "mixed queries"

As per the MS Connect entry, the problem should go away after installing the latest Cumulative Update package for SQL Server 2008 R2.

like image 24
MicSim Avatar answered Oct 10 '22 16:10

MicSim