Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

LIKE work-around in SQL (Performance issues)

I've been reading around and found that using LIKE causes a big slowdown in queries.

A workmate recommended we use

Select Name
From mytable
a.Name IN (SELECT Name 
           FROM mytable
           WHERE Name LIKE '%' + ISNULL(@Name, N'') + '%' 
           GROUP BY Name)

in lieu of

Select Name
From mytable
a.Name LIKE '%' + ISNULL(@Name, N'') + '%'

Now I'm no SQL expert and I don't really understand the inner workings of these statements. Is this a better option worth the effort of typing a few extra characters with each like statement? Is there an even better (and easier to type) alternative?

like image 740
Jonn Avatar asked Sep 11 '10 13:09

Jonn


People also ask

Why is SQL like slow?

The SQL LIKE operator very often causes unexpected performance behavior because some search terms prevent efficient index usage. That means that there are search terms that can be indexed very well, but others can not. It is the position of the wild card characters that makes all the difference.

Is like faster than in SQL?

1 Answer. Using '=' operator is faster than the LIKE operator in comparing strings because '=' operator compares the entire string but the LIKE keyword compares by each character of the string. We can use LIKE to check a particular pattern like column values starting with 'abc' in this case.


2 Answers

There are a couple of performance issues to address...

Don't Access the Same Table More Than Once, If Possible

Don't use a subquery for criteria that can be done without the need for referencing additional copies of the same table. It's acceptable if you need data from a copy of the table due to using aggregate functions (MAX, MIN, etc), though analytic functions (ROW_NUMBER, RANK, etc) might be more accommodating (assuming supported).

Don't Compare What You Don't Need To

If your parameter is NULL, and that means that you want any value for the columns you are comparing against, don't include filtration criteria. Statements like these:

WHERE a.Name LIKE '%' + ISNULL(@Name, N'') + '%'

...guarantee the optimizer will have to compare values for the name column, wildcarding or not. Worse still in the case with LIKE is that wildcarding the left side of the evaluation ensures that an index can't be used if one is present on the column being searched.

A better performing approach would be:

IF @Name IS NOT NULL 
BEGIN
   SELECT ...
     FROM ...
    WHERE a.name LIKE '%' + @Name + '%'
END
ELSE 
BEGIN
   SELECT ...
     FROM ...
END

Well performing SQL is all about tailoring to exactly what you need. Which is why you should be considering dynamic SQL when you have queries with two or more independent criteria.

Use The Right Tool

The LIKE operator isn't very efficient at searching text when you're checking for the existence of a string within text data. Full Text Search (FTS) technology was designed to address the shortcomings:

IF @Name IS NOT NULL
BEGIN
   SELECT ...
     FROM ...
    WHERE CONTAINS(a.name, @Name) 
END
ELSE
BEGIN
   SELECT ...
     FROM ...
END

Always Test & Compare

I agree with LittleBobbyTables - the solution ultimately relies on checking the query/execution plan for all the alternatives because table design & data can impact optimizer decision & performance. In SQL Server, the one with the lowest subtreecost is the most efficient, but it can change over time if the table statistics and indexes aren't maintained.

like image 63
OMG Ponies Avatar answered Sep 20 '22 10:09

OMG Ponies


Simply compare the execution plans and you should see the difference.

I don't have your exact data, but I ran the following queries against a SQL Server 2005 database of mine (yes, it's nerdy):

SELECT     UnitName
FROM         Units
WHERE     (UnitName LIKE '%Space Marine%')

SELECT     UnitName
FROM         Units
WHERE     UnitName IN (
   (SELECT UnitName FROM Units 
   WHERE UnitName LIKE '%Space Marine%' GROUP BY UnitName)
)

Here were my execution plan results:

alt text

Your co-worker's suggestion adds a nested loop and a second clustered index scan to my query as you can see above. Your mileage may vary, but definitely check the execution plans to see how they compare. I can't imagine how it would be more efficient.

like image 29
LittleBobbyTables - Au Revoir Avatar answered Sep 20 '22 10:09

LittleBobbyTables - Au Revoir