Python Regular Expression with optional but greedy groups

Question

I'm trying to write a regular expression to match a string that may or may not contain two tags. I need the expression to return all five elements of the string (the three text segments and the two tags), whichever of them exist, but when I make the tags optional, the wildcard groups seem to gobble them up.

Inputs could be:

text{a}more{b}words  
{a}text{b}test  
text  
text{b}text  
text{b}  
text{a}text

Et cetera. The only thing guaranteed is that {a} will always come before {b}, provided both exist.

My expression now looks as follows:

^(.*?)(\{a\})?(.*?)(\{b\})?(.*?)$

Unfortunately, this ends up throwing all text into the last group, regardless of whether or not the tags are present. Is there some way to make them greedy, yet keep them optional? re.findall doesn't seem to help either unfortunately.

Any help would be greatly appreciated! :)
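The behaviour described above can be reproduced with a short sketch. The pattern is the one from the question, unchanged; the helper name `split_original` and the sample strings chosen are only illustrative assumptions. Because every text group is lazy and both tags are optional, the first two text groups match nothing and the final group grows until it reaches the end of the string.

```python
import re

# The pattern from the question, unchanged.
ORIGINAL = re.compile(r"^(.*?)(\{a\})?(.*?)(\{b\})?(.*?)$")


def split_original(text):
    """Return the five captured groups for `text` (hypothetical helper)."""
    return ORIGINAL.match(text).groups()


if __name__ == "__main__":
    for sample in ["text{a}more{b}words", "text{b}text", "text"]:
        # Each line shows empty or None leading groups and the whole
        # input in the last group.
        print(split_original(sample))
```

For "text{a}more{b}words" this yields ('', None, '', None, 'text{a}more{b}words'): the lazy groups are satisfied by the empty string, so the engine never has a reason to try the tags at the positions where they actually occur.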

Andrew Clark · Accepted Answer

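Try the following regex: ^(.*(?={a})|.*?)({a})?(.*(?={b})|.*)({b})?(.*?)$

Each text group first tries a greedy .* that must stop right before its tag (the lookahead enforces this) and only falls back to the second alternative when the tag is absent. Note that when {a} is missing, the leading text ends up in the third group rather than the first.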

import re

inputs = ['{a}text{b}test', 'text', 'text{b}text', 'text{b}', 'text{a}text']
p = re.compile(r"^(.*(?={a})|.*?)({a})?(.*(?={b})|.*)({b})?(.*?)$")
for s in inputs:  # "s" rather than "input", which would shadow the built-in
    print(p.match(s).groups())

Output:

('', '{a}', 'text', '{b}', 'test')
('', None, 'text', None, '')
('', None, 'text', '{b}', 'text')
('', None, 'text', '{b}', '')
('text', '{a}', 'text', None, '')
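For readers who want to see the pieces separately, here is a sketch of the same pattern laid out with re.VERBOSE and run against the first input from the question, which the list above leaves out. The braces are escaped here purely for readability (Python's re also accepts them unescaped in this position), and the names `PATTERN` and `split_tags` are hypothetical, not part of the original answer.

```python
import re

# Same structure as the answer's pattern, spread over several lines.
# In VERBOSE mode, whitespace in the pattern is ignored and "#" starts a comment.
PATTERN = re.compile(r"""
    ^
    ( .*(?=\{a\}) | .*? )   # text before the a-tag; empty when the tag is absent
    ( \{a\} )?              # optional a-tag
    ( .*(?=\{b\}) | .* )    # text up to the b-tag, or all remaining text
    ( \{b\} )?              # optional b-tag
    ( .*? )                 # text after the b-tag
    $
""", re.VERBOSE)


def split_tags(text):
    """Return the five groups, or None if the text does not match
    (for example, when it contains a newline). Hypothetical helper."""
    m = PATTERN.match(text)
    return m.groups() if m else None


if __name__ == "__main__":
    print(split_tags("text{a}more{b}words"))
    # ('text', '{a}', 'more', '{b}', 'words')
```

This assumes single-line input; with multi-line strings the re.DOTALL flag would probably be needed as well.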

Tags:

python

regex

kmh
