I am using MySQL. I have asked a question about how to query in database for a single word match here.
There is an answer which suggest me to use REGEXP '[[:<:]]word[[:>:]]'
It is a good answer, however, I am not sure how is this REGEXP '[[:<:]]word[[:>:]]'
thing from performance perspective? If I have a large table, is this way harm the performance of my application?
For example, compare with =
operation, e.g. WHERE column_name='value'
, is the REGEXP
operation far more slow than =
for large table?
There is another answer which suggested me to use LIKE, but I think it is not good from performance point of view.
Then, I googled and found an article which says use LIKE
is even faster than REGEXP
. I get confused, which way I should use for a single word match query in a large table...
Can I say, =
is the fastest operation, then LIKE
, and REGEXP
is the poorest one from performance perspective?
Also LIKE is much faster than REGEXP.
While neither of these operations (suffix matching) could leverage table indices, the LIKE operator ran 5-6 times faster than the RLIKE / REGEXP operator to find the same data.
My experience shows that most of the time developers focus on correctness of a regex, leaving aside its performance. Yet matching a string with a regex can be surprisingly slow. So slow it can even stop any JS app or take 100% of a server CPU time causing denial of service (DOS).
Basically, LIKE does very simple wildcard matches, and REGEX is capable of very complicated wildcard matches. In fact, regular expressions ( REGEX ) are so capable that they are [1] a whole study in themselves [2] an easy way to introduce very subtle bugs.
Regarding regexp
The regexp can never use an index in MySQL.
The =
will use an index if:
Considering these and some other more esoteric caveats an =
comparison is much faster than a regexp.
Regarding like
LIKE can use an index if the wildcard is not the first char.
SELECT * FROM t WHERE a LIKE 'abc' <<-- (case insensitive `=`) can use an index
SELECT * FROM t WHERE a LIKE 'abc%' <<-- can use an index
SELECT * FROM t WHERE a LIKE 'a%' <<-- can use an index, depending on cardinality
SELECT * FROM t WHERE a LIKE '%a%' <<-- cannot use an index
SELECT * FROM t WHERE a LIKE '_agf' <<-- cannot use an index
The performance of like
when using an index is very close to =
(assuming the same number of rows returned).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With