Regex, find pattern only in middle of string

Question

I am using Python 2.6 and trying to find runs of a repeated character in a string, say a bunch of n's, e.g. nnnnnnnABCnnnnnnnnnDEF. The number of n's in each run can vary anywhere in the string.

If I construct a regex like this:

re.findall(r'^(((?i)n)\2{2,})', s),

I can find occurrences of case-insensitive n's only at the beginning of the string, which is fine. If I do it like this:

re.findall(r'(((?i)n)\2{2,}$)', s),

I can detect the ones only at the end of the string. But what about just in the middle?

At first, I thought of using re.findall(r'(((?i)n)\2{2,})', s) and the two previous regex(-ices?) to check the length of the returned list and the presence of n's either in the beginning or end of the string and make logical tests, but it became an ugly if-else mess very quickly.

Then, I tried re.findall(r'(?!^)(((?i)n)\2{2,})', s), which seems to exclude the beginning just fine, but (?!$) or (?!\z) at the end of the regex only excludes the last n in ABCnnnn. Finally, I tried re.findall(r'(?!^)(((?i)n)\2{2,})\w+', s), which seems to work sometimes, but I get weird results at other times. It feels like I need a lookahead or lookbehind, but I can't wrap my head around them.

Mazdak · Accepted Answer

Instead of writing a complicated regex that refuses to match the leading and trailing n characters, a more Pythonic approach is to strip() them from the string first, then find all the remaining runs of n's using re.findall() and a simple regex:

>>> s = "nnnABCnnnnDEFnnnnnGHInnnnnn" 
>>> import re
>>> 
>>> re.findall(r'n{2,}', s.strip('n'), re.I)
['nnnn', 'nnnnn']

Note: re.I is the ignore-case flag, which makes the regex engine match both upper-case and lower-case characters.
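One detail worth checking in the snippet above: str.strip('n') is case-sensitive, so a leading or trailing run of upper-case N's would survive the strip and then be matched by re.I. Also, n{2,} accepts runs of two, while the pattern in the question (one n followed by \2{2,}) asks for three or more. A minimal sketch that handles both, assuming the three-or-more threshold is what is wanted; the helper name middle_runs and its parameters are made up for illustration:

```python
import re

def middle_runs(s, ch='n', min_len=3):
    """Return runs of `ch` (either case) that touch neither end of `s`."""
    # strip() is case-sensitive, so list both cases explicitly.
    trimmed = s.strip(ch.lower() + ch.upper())
    # min_len=3 mirrors the question's n\2{2,}; this threshold is an assumption.
    pattern = '%s{%d,}' % (re.escape(ch), min_len)
    return re.findall(pattern, trimmed, re.I)

print(middle_runs("nnnABCnnnnDEFNNNNNGHInnnnnn"))
# ['nnnn', 'NNNNN']
```

Passing min_len=2 reproduces the behaviour of the original n{2,} pattern.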

Casimir et Hippolyte · Answer

Since "n" is a character (and not a subpattern), you can simply use:

re.findall(r'(?i)(?<=[^n])nn+(?=[^n])', s)

or better:

re.findall(r'(?i)n(?<=[^n]n)n+(?=[^n])', s)
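The lookarounds do the work here: (?<=[^n]) requires a non-n character just before the run, so the run cannot begin at the start of the string, and (?=[^n]) requires a non-n character just after it, so the run cannot reach the end. A small sketch of the same idea, with two choices that are assumptions rather than part of the answer: the flag is passed as the re.I argument (an inline (?i) placed anywhere but the start of the pattern is rejected by Python 3.11 and later), and n{3,} is used to match the question's three-or-more requirement. The name middle_runs_lookaround is hypothetical:

```python
import re

# Lookbehind: some non-n character precedes the run (so not at the start).
# Lookahead: some non-n character follows the run (so not at the end).
# With re.I, [^n] also rejects an upper-case N.
MIDDLE_N = re.compile(r'(?<=[^n])n{3,}(?=[^n])', re.I)

def middle_runs_lookaround(s):
    """Return runs of three or more n/N that touch neither end of `s`."""
    return MIDDLE_N.findall(s)

print(middle_runs_lookaround("nnnABCnnnnDEFnnnnnGHInnnnnn"))
# ['nnnn', 'nnnnn']
```

Unlike the strip() approach, this leaves the original string untouched, so MIDDLE_N.finditer(s) gives match positions relative to s itself.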


Tags:

python

regex

Dima1982
