
Regular expression that doesn't contain certain string [duplicate]

I have a string like this:

aabbabcaabda

To select the minimal group wrapped by a, I have /a([^a]*)a/, which works just fine.

But I have a problem with groups wrapped by aa. I'd need something like /aa([^aa]*)aa/, which doesn't work, and I can't adapt the first one to /aa([^a]*)aa/, because it would stop at the first occurrence of a, which I don't want.

Generally, is there any way to say "does not contain this string" in the same way that I can say "does not contain this character" with [^a]?

Simply put, I need aa, followed by any characters that do not include the sequence aa, and then a closing aa.
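
To make the problem concrete, here is a minimal Python sketch. The sample string is hypothetical and not taken from the question. It shows that [^aa] is just a character class listing a twice, so it behaves exactly like [^a] and cannot skip over a lone a inside the group.

```python
import re

# Hypothetical sample: the second group contains a single "a" in its body.
sample = "aaxyaa aaxayaa"

# Inside [...] a repeated character is redundant, so [^aa] means the same
# thing as [^a]: "any one character other than a".
with_double = re.findall(r"aa([^aa]*)aa", sample)
with_single = re.findall(r"aa([^a]*)aa", sample)

print(with_double == with_single)  # both patterns behave identically
print(with_single)                 # the group containing a lone "a" is missed
```

Both patterns find only the first group, because the negated class refuses to consume the single a in the second one.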

Jakub Arnold asked Apr 04 '09 19:04



2 Answers

By the power of Google I found a blog post from 2007 which gives the following regex, which matches strings that don't contain a certain substring:

^((?!my string).)*$ 

It works as follows: it matches zero or more (*) characters (.), none of which may begin your string (that is what the negative lookahead (?!...) checks), and the ^ and $ anchors stipulate that the entire string must be made up of such characters. Or, to put it another way:

The entire string must be made up of characters which do not begin the given string, which means that the string doesn't contain the given substring.
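
As a rough sketch of how this could look in Python's re module: the helper name lacks_substring, the compiled pattern between_aa, and the sample input are all illustrative assumptions, not part of the blog post. The second pattern applies the same lookahead idea to the aa-delimited case from the question.

```python
import re

def lacks_substring(text: str, forbidden: str) -> bool:
    """Return True if `forbidden` occurs nowhere in `text`."""
    # Same idea as ^((?!my string).)*$ : no consumed character may be the
    # start of `forbidden`. re.escape protects regex metacharacters, and
    # DOTALL lets "." match newlines as well.
    pattern = rf"(?:(?!{re.escape(forbidden)}).)*"
    return re.fullmatch(pattern, text, flags=re.DOTALL) is not None

# Delimiter variant: aa, then characters that never begin "aa", then aa.
between_aa = re.compile(r"aa((?:(?!aa).)*)aa")

sample = "aaxyaa aaxayaa"  # hypothetical input
captures = between_aa.findall(sample)

print(lacks_substring("hello world", "my string"))
print(lacks_substring("this is my string", "my string"))
print(captures)
```

Here the lookahead is checked before every character of the group, so a lone a is accepted while the pair aa ends the group.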

Grey Panther answered Sep 28 '22 16:09


In general it's a pain to write a regular expression not containing a particular string. We had to do this for models of computation - you take an NFA, which is easy enough to define, and then reduce it to a regular expression. The expression for things not containing "cat" was about 80 characters long.

Edit: I just finished and yes, it's:

aa([^a]|a[^a])*aa

Here is a very brief tutorial. I found some great ones before, but I can't see them anymore.
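
A small Python sketch of the lookahead-free pattern above, under the assumption that the alternation is meant to repeat (a trailing *) and that the pattern contains no literal spaces. The capturing group and the sample input are added here for illustration only.

```python
import re

# Assumed reading of the pattern: each step consumes either a non-"a"
# character, or an "a" that is immediately followed by a non-"a".
# The outer parentheses are added only to capture the body of the group.
no_lookahead = re.compile(r"aa((?:[^a]|a[^a])*)aa")

sample = "aaxyaa aaxayaa"  # hypothetical input
groups = no_lookahead.findall(sample)

print(groups)
```

Because every a in the body must be followed by a non-a character, the body can never contain aa, which is what the question asks for without relying on lookahead support.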

Claudiu answered Sep 28 '22 16:09