What are the zero width elements in a regular expression?

Tags: regex

Recently, I have been seeing "zero width elements" in regular expressions. What are they? Can they be treated as ghost data, so that for replacement, they won't be replaced, and for ( ) matching, they won't go into the matches[1], matches[2], etc?

Is there a good tutorial for all its various uses? Have they been here for a long time? Which version of O'Reilly's Regular Expression book was the first to discuss them?

nonopolarity asked Nov 25 '10 20:11


1 Answer

Zero-width lookaround assertions check whether a given sub-pattern can (or cannot) be matched looking forward or backward from the current position, without consuming any characters, so the text they inspect does not become part of the overall match. So, yes, the assertion itself does not count as a capturing group, and yes, the text it inspects won't be replaced (because it was never part of the match in the first place).
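As a minimal sketch of that behaviour, here is the same idea in Python's `re` module (chosen only for illustration; the sample strings and variable names are made up for this example). It shows that the match and the replacement cover only the consumed text, and that a bare assertion matches a position rather than any characters:

```python
import re

# Lookahead: match "a" only when it is followed by "b".
# The "b" is inspected but not consumed, so it is not part of the match.
m = re.search(r"a(?=b)", "ab")
matched_text = m.group(0)   # "a"
match_span = m.span()       # (0, 1)

# Replacement only touches the consumed text, so the "b" survives,
# and the "a" in "ac" is left alone because the assertion fails there.
replaced_ahead = re.sub(r"a(?=b)", "X", "ab ac")    # "Xb ac"

# Lookbehind: match "b" only when it is preceded by "a".
replaced_behind = re.sub(r"(?<=a)b", "Y", "ab cb")  # "aY cb"

# A pattern made of nothing but an assertion matches an empty string
# at each position where the assertion holds.
positions = [hit.start() for hit in re.finditer(r"(?=b)", "abcb")]  # [1, 3]
```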

However, you can put a capturing group inside a lookaround assertion, and that group is numbered and filled like any other, so it will show up in matches[1], matches[2], etc.

For example, in C#:

Regex.Replace("ab", "(a)(?=(b))", "$1$2");

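will return abb: group 1 captures the a, group 2 captures the b seen by the lookahead, and because the lookahead consumed nothing, the original b is still there after the replaced text.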
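For readers who want to check this outside .NET, the C# call above can be approximated in Python. This is a rough equivalent rather than a claim that the two engines agree in every case, and the variable names are made up for this sketch:

```python
import re

pattern = r"(a)(?=(b))"

# The overall match is just "a", yet the group inside the lookahead is filled.
m = re.search(pattern, "ab")
whole_match = m.group(0)   # "a"
groups = m.groups()        # ("a", "b")

# Only the matched "a" is replaced by group 1 + group 2 ("ab");
# the original "b" was never consumed, so it stays in place.
result = re.sub(pattern, r"\1\2", "ab")   # "abb"
```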

A very good online tutorial about regular expressions in general can be found at http://www.regular-expressions.info (even though it's a little out of date in some areas).

It contains a dedicated section on zero-width lookaround assertions, which continues in a Part II.

And of course they are covered in-depth in both Mastering Regular Expressions and the Regular Expressions Cookbook.

Tim Pietzcker answered Sep 17 '22 15:09