
About question mark in regular expression

Tags:

regex

I saw a regular expression that contains (?i). What does it mean when a question mark is put in front of a character?

Guoxing Li asked Oct 28 '11 21:10



1 Answer

In general, putting a question mark in front of a character does not mean anything and might even result in an error (if the question mark does not follow something it can apply to). It only has an effect in front of certain characters, namely those that are also used as modifiers.
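As a rough illustration of that point, the sketch below uses Python's `re` module (one flavor among many, chosen here only as an example) to check which patterns compile. The helper name `compiles` is made up for this sketch.

```python
import re


def compiles(pattern: str) -> bool:
    """Return True if Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


# 'i' is a known modifier letter, so (?i) is accepted.
modifier_ok = compiles(r"(?i)abc")

# 'z' is not a modifier letter in Python's re, so (?z) is rejected.
unknown_letter_ok = compiles(r"(?z)abc")

# A bare leading '?' has nothing before it to apply to, so it is rejected.
dangling_ok = compiles(r"?abc")

print(modifier_ok, unknown_letter_ok, dangling_ok)
```

Other flavors may report these cases differently, so treat this as a quick check for Python only.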

regular-expressions.info says about this particular syntax:

Modern regex flavors allow you to apply modifiers to only part of the regular expression. If you insert the modifier (?ism) in the middle of the regex, the modifier only applies to the part of the regex to the right of the modifier. You can turn off modes by preceding them with a minus sign. All modes after the minus sign will be turned off. E.g. (?i-sm) turns on case insensitivity, and turns off both single-line mode and multi-line mode.

Not all regex flavors support this. JavaScript and Python apply all mode modifiers to the entire regular expression. They don't support the (?-ismx) syntax, since turning off an option is pointless when mode modifiers apply to the whole regular expressions. All options are off by default.

You can quickly test how the regex flavor you're using handles mode modifiers. The regex (?i)te(?-i)st should match test and TEst, but not teST or TEST.

(?i) means that everything following it should be matched case-insensitively.
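A minimal sketch of this in Python, assuming the modifier is placed at the very start of the pattern. It also shows the `re.IGNORECASE` flag, which is Python's way of requesting the same behavior outside the pattern. The sample strings are made up for illustration.

```python
import re

# Inline modifier at the start of the pattern: the whole pattern ignores case.
inline = re.compile(r"(?i)hello")

# Same behavior requested through a compile-time flag instead.
flagged = re.compile(r"hello", re.IGNORECASE)

samples = ["hello", "HELLO", "HeLLo", "help"]

# Keep only the samples that each pattern matches in full.
inline_hits = [s for s in samples if inline.fullmatch(s)]
flagged_hits = [s for s in samples if flagged.fullmatch(s)]

print(inline_hits)
print(flagged_hits)
```

Both lists contain the three spellings of "hello" and leave out "help".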

Also note that, as the text says, not all regex flavors support this syntax.
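Python is one such case: its `re` module does not accept a mid-pattern `(?-i)`, but since Python 3.6 it offers a scoped group form, `(?i:...)`, that limits the modifier to part of the pattern. The sketch below adapts the quoted test to that form. It is an approximation of the test for Python and not the syntax the quoted text describes.

```python
import re

# Scoped modifier (Python 3.6+): only "te" ignores case; "st" must be lowercase.
scoped = re.compile(r"(?i:te)st")

candidates = ["test", "TEst", "teST", "TEST"]

# Map each candidate to whether the whole string matches.
results = {s: bool(scoped.fullmatch(s)) for s in candidates}

for text, matched in results.items():
    print(text, matched)
```

As in the quoted test, test and TEst match, while teST and TEST do not.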

Felix Kling answered Sep 20 '22 09:09