Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Meaning of question mark and x in a regex group

I'm trying to learn Atom's syntax highlighting/grammar rules, which heavily use JS regular expressions, and came across an unfamiliar pattern in the python grammar file.

The pattern starts with a (?x) which is an unfamiliar regex to me. I looked it up in an online regex tester, which seems to say that it's invalid. My initial thought was it represents an optional left paren, but I believe the paren should be escaped here.

Does this only have meaning in the Atom's coffeescript grammar, or am I overlooking a regex meaning?

(This pattern also appear in the textmate language file that I believe Atom's came from).

like image 783
beardc Avatar asked Sep 25 '15 13:09

beardc


People also ask

What does a question mark mean in regex?

A regular expression followed by a question mark (?) matches zero or one occurrences of the regular expression. Two regular expressions concatenated match an occurrence of the first followed by an occurrence of the second.

What does X mean in regex?

Most characters, including all letters ( a-z and A-Z ) and digits ( 0-9 ), match itself. For example, the regex x matches substring "x" ; z matches "z" ; and 9 matches "9" . Non-alphanumeric characters without special meaning in regex also matches itself.

How do you match a question mark in regex?

But if you want to search a question mark, you need to “escape” the regex interpretation of the question mark. You accomplish this by putting a backslash just before the quesetion mark, like this: \? If you want to match the period character, escape it by adding a backslash before it.

What is question mark in regex JS?

The question mark gives the regex engine two choices: try to match the part the question mark applies to, or do not try to match it. The engine always tries to match that part. Only if this causes the entire regular expression to fail, will the engine try ignoring the part the question mark applies to.


1 Answers

If that regular expression gets processed in Python, it'll be compiled with the 'verbose' flag.

From the Python re docs:

(?aiLmsux)

(One or more letters from the set 'a', 'i', 'L', 'm', 's', 'u', 'x'.) The group matches the empty string; the letters set the corresponding flags: re.A (ASCII-only matching), re.I (ignore case), re.L (locale dependent), re.M (multi-line), re.S (dot matches all), and re.X (verbose), for the entire regular expression. (The flags are described in Module Contents.) This is useful if you wish to include the flags as part of the regular expression, instead of passing a flag argument to the re.compile() function.

like image 199
Bryant Avatar answered Sep 28 '22 10:09

Bryant