 

How to merge regex group matches?

Tags:

regex

Let's say I have the line below:

one two three

Is it possible to write a regex that would return the line below?

one three

I can of course get each part in a separate group but is it possible to capture that in a single match?

Milad asked Aug 05 '16 11:08





1 Answer

To put it simply: no, it can't be done (as discussed in comments on your original question).

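To find out why, let's look at it a bit more generally. A regular expression can be modelled as a finite automaton. Whether an engine compiles the pattern to a DFA or, as most common engines do, runs a backtracking search, it consumes the input in sequential order, one character at a time, checking whether the next character fits the current token. A backtracking engine that hits a dead end returns to an earlier choice point (an alternation | or a quantifier) and tries the next option; if no option is left, it reports that it cannot match from that starting position.

Either way, a match is always a single contiguous stretch of the input, from the position where the match started to the position where it ended. "one three" is not a contiguous substring of "one two three", so no single match can contain it: the engine cannot consume "one", skip " two", and then resume. Capturing groups only mark sub-spans inside that contiguous match, so the pieces have to be combined after matching, outside the pattern itself.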
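For what it's worth, here is a minimal sketch of the usual workaround. The question doesn't name a language, so Python's `re` module is an assumption on my part, as are the variable names and the `\w+` pattern for a "word". It shows that the overall match still contains the middle word, and that the merge happens either by joining the groups yourself or through a replacement string.

```python
import re

line = "one two three"

# Two capturing groups around the words to keep; the middle word is
# matched (it has to be, the match is contiguous) but not captured.
pattern = re.compile(r"(\w+)\s+\w+\s+(\w+)")

m = pattern.fullmatch(line)

# The overall match (group 0) is one contiguous span, so "two" is in it.
whole = m.group(0)

# Option 1: merge the captured groups outside the regex.
merged = " ".join(m.groups())

# Option 2: let a substitution rebuild the string from backreferences.
replaced = pattern.sub(r"\1 \2", line)

print(whole)     # one two three
print(merged)    # one three
print(replaced)  # one three
```

Both options do the merging after the match has been found; the pattern only tells you where the pieces are.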

Sebastian Lenartowicz answered Oct 11 '22 04:10
