I want to find all connected words except for specific ones. For example:
0827banana82/+wine22green-729
green and wine should match, but banana should not.
I tried the following regular expression with a negative lookahead:
(?!banana)([a-zA-Z]+)
but it excludes only the first letter of banana, because anana is still a match for the second pattern. I have no idea how to get rid of that.
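To make the failure concrete, here is a minimal Python sketch of the attempt described above (the `re` module and the variable names are my choice for illustration; the question does not say which regex engine is in use):

```python
import re

text = "0827banana82/+wine22green-729"

# The lookahead only blocks a match that STARTS at the "b" of banana.
# One position later the lookahead passes, so "anana" is matched.
naive = re.compile(r"(?!banana)([a-zA-Z]+)")
naive_matches = naive.findall(text)

print(naive_matches)  # ['anana', 'wine', 'green']
```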
If you want to exclude a certain word/string in a search pattern, a good way to do this is with a regular expression lookaround assertion. It is indispensable if you want to match something not followed by something else. (?=...) is a positive lookahead and (?!...) is a negative lookahead.
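The metacharacter \b is an anchor like the caret and the dollar sign. It matches at a position that is called a “word boundary”. This match is zero-length. There are three different positions that qualify as word boundaries: before the first character in the string, if the first character is a word character; after the last character in the string, if the last character is a word character; and between two characters in the string, where one is a word character and the other is not.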
Example: The regex "aa\n" tries to match two consecutive "a"s at the end of a line, including the newline character itself. Example: "a\+" matches "a+" and not a series of one or more "a"s. The caret ^ is the anchor for the start of the string or, inside a character class, the negation symbol.
Regex Match All Except a Specific Word, Character, or Pattern (December 30, 2020, by Benjamin): Regex is great for finding specific patterns, but can also be useful to match everything except an unwanted pattern. A regular expression that matches everything except a specific pattern or word makes use of a negative lookahead.
Use the below regex to avoid one single word. Using [banana] does not do what you think it does. It is a character class that matches any one of the listed characters, and is the same as [bna]. You need to use the word boundary expression \b. Banana apple will be excluded from your match.
If the character you want to exclude is a reserved character in regex (such as ? or *) you need to include a backslash \ in front of the character to escape it, as shown: /^(?!.*\?).*/
You may add a negative lookbehind in your regex to make it work:
(?!banana)(?<![a-zA-Z])[a-zA-Z]+
RegEx Details:
(?!banana): Negative lookahead to assert that we don't have the string banana ahead of the current position
(?<![a-zA-Z]): Negative lookbehind to assert that we don't have a letter before the current position
[a-zA-Z]+: Match 1+ letters
PS: If you want to allow words like bananas then use:
(?!banana(?![a-zA-Z]))(?<![a-zA-Z])[a-zA-Z]+
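A quick way to check both patterns, sketched here with Python's `re` module (my choice of engine; the variable names and the second sample string are made up for illustration):

```python
import re

text = "0827banana82/+wine22green-729"
sample = "banana bananas wine"  # hypothetical extra input

# Lookbehind forces every match to start at the beginning of a letter run.
strict = re.compile(r"(?!banana)(?<![a-zA-Z])[a-zA-Z]+")
# Only reject "banana" when no further letter follows it.
allow_longer = re.compile(r"(?!banana(?![a-zA-Z]))(?<![a-zA-Z])[a-zA-Z]+")

print(strict.findall(text))          # ['wine', 'green']
print(strict.findall(sample))        # ['wine']
print(allow_longer.findall(sample))  # ['bananas', 'wine']
```

Note that the first pattern also drops bananas, because the lookahead only checks for the prefix banana.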
Well you can use this one:
(banana)|([a-zA-Z]+)
This captures banana in the 1st group and all the other words in the 2nd.
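In code you then keep only the matches where the 2nd group took part. A minimal sketch, assuming Python's `re` module (the names `pattern` and `kept` are just for illustration):

```python
import re

text = "0827banana82/+wine22green-729"

pattern = re.compile(r"(banana)|([a-zA-Z]+)")

# Group 2 is None whenever the first alternative (banana) matched.
kept = [m.group(2) for m in pattern.finditer(text) if m.group(2) is not None]
print(kept)  # ['wine', 'green']

# Caveat: a longer word such as "bananas" is split into banana + "s".
split_case = [m.group(2) for m in pattern.finditer("bananas") if m.group(2) is not None]
print(split_case)  # ['s']
```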
Another variation might be matching the characters a-zA-Z until there are no more. Then assert that banana is not directly to the left.
[a-zA-Z]+(?![a-zA-Z])(?<!banana)
The pattern matches:
[a-zA-Z]+ Match 1+ chars a-zA-Z
(?![a-zA-Z]) Negative lookahead, assert not a-zA-Z directly to the right
(?<!banana) Negative lookbehind, assert banana is not directly to the left
If you want to match bananas or straightbanana, you can assert that what is on the left is not a banana that is itself not preceded by a char a-zA-Z
[a-zA-Z]+(?![a-zA-Z])(?<!(?<![a-zA-Z])banana)
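The difference between the two variants shows up on words that merely end in banana. A small sketch, assuming Python's `re` module (the sample string and variable names are hypothetical):

```python
import re

sample = "banana bananas straightbanana wine"  # hypothetical input

# Rejects every letter run that ENDS in "banana".
ends_with = re.compile(r"[a-zA-Z]+(?![a-zA-Z])(?<!banana)")
# Rejects only a run that IS "banana" (no letter in front of it).
whole_word = re.compile(r"[a-zA-Z]+(?![a-zA-Z])(?<!(?<![a-zA-Z])banana)")

print(ends_with.findall(sample))   # ['bananas', 'wine']
print(whole_word.findall(sample))  # ['bananas', 'straightbanana', 'wine']
```

Both lookbehinds are fixed-width (six characters), which is what Python's `re` requires; the nested lookbehind adds no width of its own.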
As suggested by @bobble bubble in the comments, if possessive quantifiers are supported, the pattern can be shortened, here also using a case-insensitive match:
[a-z]++(?<!(?<![a-z])banana)
[a-z]++ Match 1+ chars in the range of a-z (possessive, do not backtrack)
(?<! Negative lookbehind, assert what is directly to the left is not
(?<![a-z])banana banana that is not preceded by a-z (inner negative lookbehind)
) Close lookbehind
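Python's `re` only accepts `++` from version 3.11 on, so the sketch below swaps in a different but equivalent trick: a capturing lookahead followed by a backreference, `(?=([a-z]+))\1`, which behaves possessively because the engine does not backtrack into a lookahead. This is my substitution, not the pattern from the comment, and the variable names are made up:

```python
import re

text = "0827banana82/+wine22green-729"

# Emulated possessive [a-z]++ : the lookahead grabs the whole letter run,
# \1 consumes it, and the run can never be shortened afterwards.
atomic = re.compile(r"(?=([a-z]+))\1(?<!(?<![a-z])banana)", re.IGNORECASE)
words = [m.group(0) for m in atomic.finditer(text)]
print(words)  # ['wine', 'green']

# Without possessiveness the engine backtracks one letter and matches "banan".
backtracking = re.compile(r"[a-z]+(?<!(?<![a-z])banana)", re.IGNORECASE)
print(backtracking.findall(text))  # ['banan', 'wine', 'green']
```

The second result shows why the plain `+` is not enough here once the `(?![a-z])` lookahead has been dropped.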
My two cents, assuming you do want to match words like "bananas":
(\b|\d)(?:banana|([a-zA-Z]+))(?1)
Your matches are in group 2.
(\b|\d) - A 1st capture group to hold a word-boundary or a digit.
(?:banana|([a-zA-Z]+)) - A non-capture group with the alternation of either exactly "banana" or a 2nd capture group of 1+ alpha-chars.
(?1) - Repeat the subpattern of the 1st capture group.
EDIT: If the (?1) subroutine call is not supported, you can try
(?:\b|\d)(?:banana|([a-zA-Z]+))(?:\b|\d)
Or, using lookarounds:
(?i)(?<![a-z])(?:banana|([a-z]+))(?![a-z])
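Python's `re` has no `(?1)`, so only the last two patterns can be tried there. In both of them the first group is non-capturing, so the word lands in group 1 rather than group 2. A rough sketch (the helper `wanted_words` and the extra sample strings are hypothetical):

```python
import re

lookaround = re.compile(r"(?i)(?<![a-z])(?:banana|([a-z]+))(?![a-z])")
consuming = re.compile(r"(?:\b|\d)(?:banana|([a-zA-Z]+))(?:\b|\d)")

def wanted_words(pattern, text):
    """Return group 1 of every match; it is None when 'banana' matched."""
    return [m.group(1) for m in pattern.finditer(text) if m.group(1) is not None]

text = "0827banana82/+wine22green-729"
print(wanted_words(lookaround, text))  # ['wine', 'green']
print(wanted_words(consuming, text))   # ['wine', 'green']
print(wanted_words(lookaround, "banana bananas straightbanana"))
# ['bananas', 'straightbanana']

# The \d version consumes the digit, so a single digit between two
# words cannot serve as a delimiter for both of them.
print(wanted_words(consuming, "wine2green"))   # ['wine']
print(wanted_words(lookaround, "wine2green"))  # ['wine', 'green']
```

The lookaround version does not consume the delimiters, which is why it handles the single-digit case.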