I came across a regex like the following:
foo(?!.*foo)
If it is fed with foo bar bar foo, it will find the last occurrence of foo. I know it uses a mechanism called negative lookahead, which means it will match a word that is not followed by the characters after the ?!. But how does the regex here work?
A regex pattern matches a target string. The pattern is composed of a sequence of atoms. An atom is a single point within the regex pattern which it tries to match to the target string. The simplest atom is a literal; to treat several parts of the pattern as one atom, you group them using ( ) as metacharacters.
$ means "Match the end of the string" (the position after the last character in the string).
To count how many times a regex pattern matches in a given string, use len(re.findall(pattern, string)), which returns the number of non-overlapping matches, or len([*re.finditer(pattern, text)]), which unpacks all match objects into a list and returns its length.
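As a minimal sketch of the two counting approaches, assuming the sample string from the question and a plain foo pattern (both chosen only for illustration):

```python
import re

text = "foo bar bar foo"
pattern = r"foo"

# findall returns a list of the matched substrings (non-overlapping).
count_findall = len(re.findall(pattern, text))

# finditer yields match objects; unpacking them into a list lets us count them.
count_finditer = len([*re.finditer(pattern, text)])

print(count_findall, count_finditer)
```

Both calls scan left to right and never count overlapping matches, so they should agree for any pattern.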
Method 1: Match everything after the first occurrence
Whitespace characters include spaces, tabs, line breaks, etc., while non-whitespace characters include all letters, numbers, and punctuation. So essentially, combining \s and \S in one character class, [\s\S], matches any character, including line breaks.
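A small sketch of this method, assuming the marker word is foo and a made-up two-line input string. The capture group and the variable names are illustrative, not from the original text:

```python
import re

text = "keep this foo and\nall of this foo too"

# [\s\S]* matches any character, including newlines, so the group
# captures everything after the first occurrence of foo.
m = re.search(r"foo([\s\S]*)", text)
rest = m.group(1) if m else None

# For comparison: a plain dot stops at the first line break.
m_dot = re.search(r"foo(.*)", text)
rest_same_line = m_dot.group(1) if m_dot else None

print(repr(rest))
print(repr(rest_same_line))
```

In Python, passing re.DOTALL with a plain .* should give the same result as [\s\S]*; the character-class form is just portable to engines without such a flag.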
Slightly different answer from sshashank (because the word "containing" in his answer doesn't work for me, and in regex you have to be pedantic: it's all about precision). I'm 100% sure sshashank knows this and only phrased it that way for brevity.
The regex matches foo, not followed (i.e., negative lookahead (?!) by this:
{{{any number of any characters (i.e., .*) then the characters foo}}}
If the lookahead fails, the portion corresponding to .* does not have to contain foo. The foo comes after it.
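To see this in action, here is a short sketch in Python using the string from the question. The variable names are only for illustration:

```python
import re

text = "foo bar bar foo"
pattern = re.compile(r"foo(?!.*foo)")

# Collect the start offset and text of every match.
matches = [(m.start(), m.group()) for m in pattern.finditer(text)]
print(matches)

# The first foo (offset 0) is rejected because ".*foo" can match after it;
# only the last foo survives, which is the same offset str.rfind reports.
last_offset = text.rfind("foo")
print(last_offset)
```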
See this automatic translation:
NODE                     EXPLANATION
--------------------------------------------------------------------------------
  foo                      'foo'
--------------------------------------------------------------------------------
  (?!                      look ahead to see if there is not:
--------------------------------------------------------------------------------
    .*                       any character except \n (0 or more times
                             (matching the most amount possible))
--------------------------------------------------------------------------------
    foo                      'foo'
--------------------------------------------------------------------------------
  )                        end of look-ahead
The same in different words from regex101:
/foo(?!.*foo)/
foo matches the characters foo literally (case sensitive)
(?!.*foo) Negative Lookahead - Assert that it is impossible to match the regex below
    .* matches any character (except newline)
        Quantifier: Between zero and unlimited times, as many times as possible, giving back as needed [greedy]
    foo matches the characters foo literally (case sensitive)
What does RegexBuddy have to say?
foo(?!.*foo)
    foo
    (?!.*foo)
        .*
            *
        foo
It matches foo only if it is not followed (?!) by any more text (.*) containing foo in it.
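One detail worth checking is that .* does not cross line breaks by default, so "any more text" really means "any more text on the same line". A rough sketch, assuming a made-up two-line input and Python's re.DOTALL flag:

```python
import re

text = "foo one foo\nfoo two"
pattern = r"foo(?!.*foo)"

# Default mode: "." stops at "\n", so the last foo on each line matches.
default_starts = [m.start() for m in re.finditer(pattern, text)]

# With DOTALL, "." also matches "\n", so only the very last foo matches.
dotall_starts = [m.start() for m in re.finditer(pattern, text, flags=re.DOTALL)]

print(default_starts, dotall_starts)
```

If the goal is the last occurrence in a multi-line string, the DOTALL variant (or [\s\S]* inside the lookahead) is probably what you want.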
Negative lookahead is essential if you want to match something not followed by something else.
Short explanation:
foo(?!.*foo) matches foo when it is not followed by any number of characters (other than \n) and then another foo
For example, say you have the following two strings.
foobar
barfoo
And the regular expression:
foo(?!bar)
This matches foo when it is not followed by bar, so it would match the foo in the string barfoo here, but nothing in foobar.
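The two-string example can be checked with a few lines of Python. This is just a sketch; the dictionary of results is an illustrative way to show which string matches:

```python
import re

pattern = re.compile(r"foo(?!bar)")

# search() returns a match object if the pattern matches anywhere, else None.
results = {s: bool(pattern.search(s)) for s in ["foobar", "barfoo"]}
print(results)

# In barfoo the match is the trailing foo, starting at offset 3.
match_start = pattern.search("barfoo").start()
print(match_start)
```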