I have a free text catalog on a simple table on SQL Server 2008R2:
CREATE FULLTEXT CATALOG customer_catalog;
CREATE FULLTEXT INDEX ON customer
(
name1
)
KEY INDEX customer_pk
ON customer_catalog;
ALTER FULLTEXT INDEX ON customer START UPDATE POPULATION;
If I perform the following three queries the first two return almost immediately, while the last one takes ~14 seconds on a table with 100,000 records:
SELECT
customer_id
FROM
customer
WHERE
CONTAINS(customer.*, 'nomatch');
SELECT
customer_id
FROM
customer
WHERE
customer.customer_id = 0;
SELECT
customer_id
FROM
customer
WHERE
CONTAINS(customer.*, 'nomatch')
OR customer.customer_id = 0;
Here are the queryplans:
Why is the third query so much slower? Can I do anything to improve it or do I need to split the query?
It's hard to say why, but it appears SQL Server is choosing an inefficient query plan. Here are some suggestions:
Update the statistics on the table:
UPDATE STATISTICS dbo.customer
Once the statistics are up to date, you could try your queries again and see if there is improvement.
Something else is that for the combined OR
statement, SQL Server is using an index scan, instead of a seek. You could try the FORCESEEK hint and see if that makes a difference:
SELECT customer_id
FROM customer WITH (FORCESEEK)
WHERE CONTAINS(customer.*, 'nomatch')
OR customer.customer_id = 0;
One other option, as you mentioned, is to split the statements. The following UNION
performs just as well as your first two statements combined:
SELECT customer_id FROM customer
WHERE CONTAINS(customer.*, 'nomatch')
UNION
SELECT customer_id FROM customer
WHERE customer.customer_id = 0
Update - changed above query to UNION
instead of UNION ALL
.
As @PondLife indicated in the comments, I meant to do a UNION
in the above query instead of UNION ALL
. After thinking about it, I also tried with UNION ALL
and it seemed to be faster. This assumes you don't care about the duplicate IDs:
SELECT customer_id FROM customer
WHERE CONTAINS(customer.*, 'nomatch')
UNION ALL
SELECT customer_id FROM customer
WHERE customer.customer_id = 0
The "OR" logicial condition often makes queries run very slowly : / Often, the best option is to use UNION (ALL).
In your case, I'm quite curious about the usage you make of
SELECT
customer_id
FROM
customer
WHERE
customer.customer_id = 0;
It would only result in a list (maybe empty) of zeros. Is it to count (!) how many customer have an id = 0? Is it to check if any customer has an id of 0?
If it's not to count the zeroes but to know if they are any, then this query should be efficient:
SELECT
customer_id
FROM
customer
WHERE
CONTAINS(customer.*, 'nomatch')
AND customer.customer_id <> 0
UNION ALL
SELECT TOP(1)
0
FROM
customer
WHERE
customer.customer_id = 0
otherwise the efficient query is this one:
SELECT
customer_id
FROM
customer
WHERE
CONTAINS(customer.*, 'nomatch')
AND customer.customer_id <> 0
UNION ALL
SELECT
0
FROM
customer
WHERE
customer.customer_id = 0
(I just removed the TOP clause)
Depending on your MS SQL 2008 R2 sevice pack version, your problem might be related to following Microsoft Connect issue: Full-text performance with "mixed queries"
As per the MS Connect entry, the problem should go away after installing the latest Cumulative Update package for SQL Server 2008 R2.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With