This query (with a different name in place of "jack") shows up many times in my slow query log. Why?
The Users table has many fields (more than the three I've selected) and about 40,000 rows.
SELECT name, username, id
FROM Users
WHERE (name REGEXP '[[:<:]]jack[[:>:]]')
   OR (username REGEXP '[[:<:]]jack[[:>:]]')
ORDER BY name
LIMIT 0, 5;
id is the primary key and auto-increments.
name has an index.
username has a unique index.
Sometimes it takes 3 seconds! When I run EXPLAIN on the select in MySQL, I get this:
select_type: SIMPLE
table: Users
type: index
possible_keys: NULL
key: name
key_len: 452
ref: NULL
rows: 5
Extra: Using where
Is this the best I can do? What can I fix?
LIKE is faster than REGEXP. If you can get away with using it instead of REGEXP, do it.
Yeah, it probably would be a tiny bit faster, because standard-SQL LIKE is a simpler comparison operation than a full-on regex parser. However, in real terms both are really slow here, because neither can use indices. (LIKE can use an index if the match string doesn't start with a wildcard, but that's not the case here.)
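One way to check the index claim on your own data is to compare the EXPLAIN plans for an anchored and an unanchored pattern. This is only a sketch: it assumes the Users table and the index on name described in the question, and the plan you actually get depends on your MySQL version and data.

```sql
-- Pattern anchored at the start: the optimizer can treat this as a range
-- over the index on name (look for type: range in the output).
EXPLAIN SELECT name, username, id
FROM Users
WHERE name LIKE 'jack%';

-- Pattern with a leading wildcard: no index range is possible, so expect
-- a full scan (type: ALL or type: index) instead.
EXPLAIN SELECT name, username, id
FROM Users
WHERE name LIKE '%jack%';
```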
MySQL supports another type of pattern matching based on regular expressions and the REGEXP operator. It provides powerful and flexible pattern matching that can help us implement powerful search utilities for our database systems. REGEXP is the operator used when performing regular expression pattern matches.
Basically, LIKE does very simple wildcard matches, and REGEXP is capable of very complicated pattern matches. In fact, regular expressions (REGEXP) are so capable that they are (1) a whole study in themselves and (2) an easy way to introduce very subtle bugs.
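A small example of how easy it is to trip over regex details: the word-boundary markers used in the question come from the older Henry Spencer regex library. As far as I know, MySQL 8.0.4 and later use the ICU library instead, where those markers are not supported and \b is the word-boundary equivalent. A sketch against the question's Users table:

```sql
-- MySQL before 8.0.4 (Spencer regex library):
-- [[:<:]] and [[:>:]] mark the start and end of a word.
SELECT name, username, id
FROM Users
WHERE name REGEXP '[[:<:]]jack[[:>:]]';

-- MySQL 8.0.4 and later (ICU regex library):
-- use \b, written as \\b inside a string literal.
SELECT name, username, id
FROM Users
WHERE name REGEXP '\\bjack\\b';
```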
If you must use regexp-style WHERE clauses, you will definitely be plagued by slow-query problems. For a regexp-style search to work, MySQL has to compare every value in your name column against the regexp. And your query has doubled the trouble by also looking at your username column.
This means MySQL can't use any index to narrow down the rows it examines, which is how all DBMSs speed up queries on large tables.
There are a few things you can try. All of them involve saying goodbye to REGEXP.
One is this:
WHERE name LIKE CONCAT('jack', '%') OR username LIKE CONCAT('jack', '%')
If you create indexes on your name and username columns, this should be decently fast. It will look for all names/usernames beginning with 'jack'. Note that
WHERE name LIKE CONCAT('%','jack') /* SLOW!!! */
will look for names ending with 'jack' but will be slow like your regexp-style search.
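If the OR between the two columns keeps the optimizer from using both indexes, one rewrite worth trying is a UNION, so that each branch can use its own index. This is a sketch under the question's schema, not a guaranteed win: MySQL may already combine the two indexes on its own, so compare the plans with EXPLAIN before adopting it.

```sql
-- Each branch filters on a single indexed column with an anchored pattern.
-- UNION (rather than UNION ALL) drops rows that match in both branches.
(SELECT name, username, id FROM Users WHERE name LIKE 'jack%')
UNION
(SELECT name, username, id FROM Users WHERE username LIKE 'jack%')
ORDER BY name
LIMIT 0, 5;
```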
Another thing you can do is figure out why your application needs to be able to search for part of a name or username. You can either eliminate this feature from your application, or figure out some better way to handle it.
Possible better ways:
All of these involve some programming work.
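One option that fits the whole-word search in the question is MySQL's FULLTEXT search. The sketch below is an assumption about what a better way could look like, not necessarily what this answer had in mind. The index name ft_users_names is hypothetical, InnoDB needs MySQL 5.6 or later for FULLTEXT indexes, and words shorter than the configured minimum token length are not indexed.

```sql
-- One-time setup: a FULLTEXT index covering both searched columns.
-- ft_users_names is a made-up name for illustration.
ALTER TABLE Users ADD FULLTEXT INDEX ft_users_names (name, username);

-- Whole-word match for 'jack' in either column.
-- The column list in MATCH() must be the same as in the index definition.
SELECT name, username, id
FROM Users
WHERE MATCH(name, username) AGAINST('jack' IN BOOLEAN MODE)
ORDER BY name
LIMIT 0, 5;
```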
I got about a 50% speedup just by adding fieldname != '' to the WHERE clause. It makes MySQL use indexes.
SELECT name, username, id
FROM users
WHERE name != ''
AND (name REGEXP '[[:<:]]jack[[:>:]]' or username REGEXP '[[:<:]]jack[[:>:]]')
ORDER BY name
LIMIT 0,5;
Not a perfect solution but helps.
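Whether the extra condition really changes the plan on your data is easy to check with EXPLAIN. A sketch, using the table name Users as spelled in the question; compare the type, key and rows columns with and without the name != '' line.

```sql
-- Run once as written and once with the "name != ''" condition removed,
-- then compare the type, key and rows columns of the two plans.
EXPLAIN SELECT name, username, id
FROM Users
WHERE name != ''
  AND (name REGEXP '[[:<:]]jack[[:>:]]' OR username REGEXP '[[:<:]]jack[[:>:]]')
ORDER BY name
LIMIT 0, 5;
```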