I have an expression like c.{0,2}?m and a string like "abcemtcmncefmf". Currently it matches three substrings: cem, cm and cefm (see here). But I would like to match only the shortest of these, in this case cm.
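The three matches can be reproduced outside MariaDB. The following is a minimal sketch in Python, chosen only for illustration; its re module is not PCRE, but it treats this particular pattern the same way. It also shows why a global match would make the task easy: with all matches available, picking the shortest is trivial.

```python
import re

subject = "abcemtcmncefmf"
pattern = re.compile(r"c.{0,2}?m")

# findall scans left to right and returns every non-overlapping match.
matches = pattern.findall(subject)
print(matches)   # ['cem', 'cm', 'cefm']

# With all matches in hand, the shortest one is easy to pick.
shortest = min(matches, key=len)
print(shortest)  # cm
```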
My problem is that I don't have global match support, only the first match, because I'm using the MariaDB REGEXP_SUBSTR() function. My current solution is a stored procedure that I wrote for this, but it is 10 times slower than a plain regular expression for simple cases.
I also tried something like (cm|c.{0,1}?m|c.{0,2}?m), but it didn't work: at each starting position the engine takes the first alternative that succeeds, instead of trying each alternative in turn across the whole subject string.
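The failure of the plain alternation can be seen in a small sketch (Python again, purely as an illustration; the same leftmost-first behavior applies to PCRE). The engine tries every alternative at the first position where any of them can match before it moves further into the string, so the earlier cem beats the later cm.

```python
import re

subject = "abcemtcmncefmf"

# At index 2 the branch "cm" fails, but "c.{0,1}?m" succeeds with "cem",
# so the search stops there and never reaches the "cm" at index 6.
m = re.search(r"(cm|c.{0,1}?m|c.{0,2}?m)", subject)
first = m.group(0)
start = m.start()
print(first, start)  # cem 2
```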
I know that regular expressions (PCRE) have some black magic features, but I haven't found anything that solves my problem.
[...] (.{0,2}?) on my current pattern;
[] denotes a character class. () denotes a capturing group. [a-z0-9] -- One character that is in the range a-z OR 0-9. (a-z0-9) -- Explicit capture of the literal text a-z0-9.
To match a character that has a special meaning in regex, escape it by prefixing it with a backslash ( \ ). E.g., \. matches "." ; regex \+ matches "+" ; and regex \( matches "(" . You also need to use regex \\ to match "\" (backslash).
Use square brackets [] to match any one character in a set. Use \w to match any single word character: 0-9 , a-z , A-Z , and _ (underscore). Use \d to match any single digit. Use \s to match any single whitespace character.
The most common regex tokens for finding whitespace are \s and \s+ . The difference is that \s matches a single whitespace character, while \s+ matches a run of one or more consecutive whitespace characters.
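The difference between \s and \s+ is easiest to see by counting matches. A minimal sketch in Python (the sample text is made up for illustration):

```python
import re

text = "a  b\tc"  # two spaces, then a tab

# \s matches each whitespace character separately.
singles = re.findall(r"\s", text)
print(len(singles))  # 3

# \s+ matches each run of whitespace as one unit.
runs = re.findall(r"\s+", text)
print(len(runs))     # 2

# Typical use: collapse every run of whitespace to a single space.
collapsed = re.sub(r"\s+", " ", text)
print(collapsed)     # a b c
```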
You can simply use an alternation in a branch reset group:
/^(?|.*(cm)|.*(c.m)|.*(c..m))/s
(The result is in group 1)
or like this:
/^.*\Kcm|^.*\Kc.m|^.*\Kc..m/s
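The first successful branch wins, and because the branches are ordered from shortest to longest, that is the shortest possible match. In the second pattern, \K drops everything matched before it from the reported match, so only the cm, c.m or c..m part is returned. Note that the greedy .* makes each branch find the last occurrence of its candidate in the string.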
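The logic of these patterns can be emulated in engines that lack branch reset groups and \K. The sketch below uses Python's standard re module, which supports neither, so each branch is run as a separate anchored pattern in a loop. The function name shortest_match and the BRANCHES list are hypothetical names for this illustration only; this is not how MariaDB evaluates the pattern.

```python
import re
from typing import Optional

# One anchored pattern per branch, ordered from shortest to longest
# candidate, mirroring the alternation in the answer.
BRANCHES = [r"^.*(cm)", r"^.*(c.m)", r"^.*(c..m)"]

def shortest_match(subject: str) -> Optional[str]:
    """Return group 1 of the first branch that matches, or None."""
    for branch in BRANCHES:
        # re.DOTALL plays the role of the /s flag: "." also matches newlines.
        m = re.match(branch, subject, re.DOTALL)
        if m:
            return m.group(1)
    return None

print(shortest_match("abcemtcmncefmf"))  # cm
print(shortest_match("abcemt"))          # cem
```

Each branch is tried against the whole subject before the next one is considered, which is exactly what the plain alternation (cm|c.{0,1}?m|c.{0,2}?m) could not do.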