I have a DB query that causes a full table scan because of a LIKE clause, and it raised a question I was curious about...
Which of the following should run faster in MySQL, or would they both run at the same speed? Benchmarking might answer it in my case, but I'd like to know the why behind the answer. The column being filtered contains a couple thousand characters, if that's important.
SELECT * FROM users WHERE data LIKE '%=12345%'
or
SELECT * FROM users WHERE data LIKE '%profileId=12345%'
I can come up with reasons why each of these might outperform the other, but I'm curious to know the logic.
Comparing strings with the '=' operator is generally faster than with LIKE, because '=' tests the whole string for equality while LIKE has to match a pattern against the string character by character. LIKE is the right tool when you need a pattern, for example to find column values starting with 'abc'.
In SQL, the LIKE keyword is used to search for patterns. Pattern matching uses wildcard characters (% and _) to match different combinations of characters.
All things being equal, longer match strings should run faster, since they allow the matcher to skip through the text being searched in bigger steps and make fewer comparisons.
For an example of the algorithms behind string matching, see the Boyer-Moore algorithm on Wikipedia.
Of course not all things are equal, so I would definitely benchmark it.
A quick check of the MySQL reference docs turned up the following paragraph:
If you use ... LIKE '%string%' and string is longer than three characters, MySQL uses the Turbo Boyer-Moore algorithm to initialize the pattern for the string and then uses this pattern to perform the search more quickly.
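To make the "bigger steps" argument concrete, here is a minimal sketch of Boyer-Moore-Horspool, a simplified relative of Boyer-Moore. It is not MySQL's Turbo Boyer-Moore implementation, and the function name and the sample row are made up for illustration. It counts how many window positions are examined before the match is found:

```python
def horspool_search(text, pattern):
    """Return (index of first match or -1, number of window positions tried).

    Boyer-Moore-Horspool: a simplified variant of Boyer-Moore, used here only
    to illustrate skipping. This is NOT MySQL's actual implementation.
    """
    m, n = len(pattern), len(text)
    if m == 0:
        return 0, 0
    # For each character of the pattern except the last one: how far the
    # window may move when that character sits under the window's last slot.
    shift = {ch: m - 1 - i for i, ch in enumerate(pattern[:-1])}
    pos = 0
    tried = 0
    while pos <= n - m:
        tried += 1
        if text[pos:pos + m] == pattern:
            return pos, tried
        # Characters that do not occur in the pattern allow a full-length jump.
        pos += shift.get(text[pos + m - 1], m)
    return -1, tried


# Hypothetical row: filler text followed by the key/value pair being searched.
row = "a" * 2000 + "profileId=12345" + "a" * 100

short_index, short_tried = horspool_search(row, "=12345")
long_index, long_tried = horspool_search(row, "profileId=12345")
print(short_index, short_tried)
print(long_index, long_tried)
```

On this made-up row the longer pattern examines far fewer window positions, because most of the text consists of characters that do not occur in the pattern, so the window jumps by the full pattern length each time. Real data will be less favorable, which is why benchmarking is still worthwhile.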
No difference whatsoever. Because you've got a % sign at the beginning of your LIKE expression, that completely rules out the use of indexes, which can only be used to match a prefix of the string.
So it will be a full table scan either way.
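As a rough illustration of why an index helps only with prefixes, the sketch below treats a sorted Python list as a stand-in for an ordered index. This is an analogy, not how MySQL stores its indexes, and the helper names and sample values are made up. A prefix can be located by binary search because all matching values sit next to each other in sorted order; a substring can appear anywhere in a value, so every value has to be checked:

```python
from bisect import bisect_left


def prefix_lookup(sorted_values, prefix):
    """Find values starting with `prefix` using the sort order (index-like)."""
    # Binary search jumps straight to the first value >= prefix.
    start = bisect_left(sorted_values, prefix)
    matches = []
    for value in sorted_values[start:]:
        # Values sharing the prefix are contiguous; stop at the first miss.
        if not value.startswith(prefix):
            break
        matches.append(value)
    return matches


def substring_scan(values, needle):
    """Find values containing `needle`; the sort order is of no help here."""
    return [value for value in values if needle in value]


# Hypothetical column values, kept sorted like an index would keep them.
values = sorted([
    "id=1",
    "profileId=12345&x=1",
    "profileId=999",
    "zeta=12345",
])

print(prefix_lookup(values, "profileId="))   # like LIKE 'profileId=%'
print(substring_scan(values, "=12345"))      # like LIKE '%=12345%'
```

The first call only touches the neighborhood of the matching values, while the second has to visit every value, which corresponds to the full table scan described above.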
In a database of significant size (i.e. one which doesn't fit in RAM on your 32 GB server), I/O is the biggest cost by a very large margin, so I'm afraid the string pattern-matching algorithm will not be relevant.