
Positive Lookahead Regex

Tags:

regex

I have the following regex:

^(?=.{8}$).+

As I understand it, this will accept 8 characters of any type, followed by 1 or more of any character. I feel I am not grasping how a positive lookahead works. Since both parts of the regex are looking for '.', wouldn't any series of characters fit this?

My question is, how does the positive lookahead affect this regex, and what is an example of a matching string?

The following strings did not match when I tried them in a regex tool:

  • 123456781
  • (12345678)1
  • (12345678)
  • (abcdefgh)a
  • (abcdefgh)
  • abc
  • 123

EDIT: Removed the first two data entries. I clearly wasn't using the regex tool correctly; they now match, since they are exactly 8 characters long.

Nashibukasan asked Apr 24 '13 01:04




2 Answers

^(?=.{8}$).+

will match the string

aaaaaaaa

Reasoning:

The content inside the parentheses is a lookahead, since it starts with ?=.

The content inside a lookahead is itself a regular expression; it is not interpreted literally.

Thus, the lookahead only allows the regex to match if .{8}$ would match at the current position (the start of the string, in this case). So the string has to be exactly eight characters long and then end, as required by $. The lookahead itself consumes no characters.

Then .+ matches those same eight characters, starting again from the beginning of the string.
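
To see this behaviour concretely, here is a minimal sketch using Python's re module (my choice of language; the question does not name an engine). The helper name matches and the sample list are only illustrative, with the samples taken from strings in this thread.

```python
import re

# The regex from the question: the lookahead asserts "exactly 8 characters,
# then end of string" without consuming anything; .+ then consumes them.
PATTERN = re.compile(r"^(?=.{8}$).+")


def matches(s: str) -> bool:
    """Return True if the pattern matches the string s."""
    return PATTERN.search(s) is not None


samples = ["aaaaaaaa", "12345678", "123456781", "(12345678)", "abc"]
results = {s: matches(s) for s in samples}
for s, ok in results.items():
    print(f"{s!r}: {ok}")

# The match covers all eight characters, so the lookahead consumed nothing.
span = PATTERN.search("aaaaaaaa").span()

# The lookahead on its own succeeds but matches an empty string.
lookahead_only = re.search(r"^(?=.{8}$)", "aaaaaaaa").group()
```

Only the strings that are exactly 8 characters long match. Anything longer or shorter fails the lookahead before .+ ever gets a chance to run.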

Patashu answered Nov 09 '22 23:11


It is trying to match:

^               # start of line, but...
(?=.{8}$)       # only if it precedes exactly 8 characters and the end of line
.+              # this one matches those 8 characters

From your input, it should also match these (try a regex engine with "match at line breaks" checked):

12345678
abcdefgh
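
The "match at line breaks" option presumably corresponds to multiline mode. As a rough sketch, assuming that option behaves like Python's re.MULTILINE flag (an assumption on my part, since the tool is not named here), the same pattern can be run against several lines at once. The variable names and the sample text are only illustrative.

```python
import re

# Several candidate strings, one per line.
text = "123456781\n12345678\nabcdefgh\nabc"

# With re.MULTILINE, ^ and $ match at the start and end of every line,
# so the lookahead is checked once per line.
multiline_pattern = re.compile(r"^(?=.{8}$).+", re.MULTILINE)
found = multiline_pattern.findall(text)
print(found)

# Without the flag, ^ only matches at the very start of the text and
# $ only at the very end, so nothing matches here.
found_without_flag = re.findall(r"^(?=.{8}$).+", text)
print(found_without_flag)
```

With the flag, only the two 8-character lines are returned. Without it, the whole input is treated as a single string and the result is empty.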

perreal answered Nov 09 '22 22:11