python: regex only gets the last occurrence

Question

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import re

text = "aaaa[ab][cd][ef]"

a = re.compile(r"^(\w+)(\[\w+\])*$").findall(text)

print(a)

I need all of them, but it returns:

[('aaaa', '[ef]')]

With:

a = re.compile(r"\[\w+\]").findall(text)

I get all of the bracketed parts, but the first word is left out:

['[ab]', '[cd]', '[ef]']

(This sentence is filler text, added only to satisfy Stack Overflow's post quality standards.)

NPE · Accepted Answer

Here is how you can do it:

In [14]: a = re.compile(r"(\w+|\[\w+\])").findall(text)

In [15]: print(a)
['aaaa', '[ab]', '[cd]', '[ef]']

Each match is one run of word characters, with or without surrounding brackets.
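One caveat, sketched below under the assumption that Python's standard re module is used: the alternation is not anchored, so it does not require the leading word to be present. The second input string is a made-up example, not taken from the question. The outer parentheses are dropped here because findall returns whole matches when the pattern has no groups.

```python
import re

# Same alternation as above, without the outer capturing group.
pattern = re.compile(r"\w+|\[\w+\]")

# The string from the question: one item per match.
with_word = pattern.findall("aaaa[ab][cd][ef]")
print(with_word)     # ['aaaa', '[ab]', '[cd]', '[ef]']

# Hypothetical input with no leading word: it still matches.
without_word = pattern.findall("[ab][cd]")
print(without_word)  # ['[ab]', '[cd]']
```

If the leading word must be present, an anchored pattern is needed instead.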

cvoinescu · Answer

There is only one match: the "^(\w+)" part matches "aaaa" and the "(\[\w+\])*$" part matches "[ab][cd][ef]". Note that you get a list of one element (which is a tuple), so there's only one match. Each pair of capturing parentheses in the regexp produces one element in the tuple, holding the text that matched whatever was inside them. There are two pairs, so there are two elements in the tuple. The second pair of parentheses is starred, but that only causes its result to be "assigned" multiple times, and Python's re module keeps only the last value. It does not multiply the parentheses themselves, so you don't get a larger tuple.
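A minimal sketch of this behaviour, using the pattern and text from the question (the variable names pattern and match are just for illustration):

```python
import re

text = "aaaa[ab][cd][ef]"

# Two pairs of capturing parentheses give two groups,
# no matter how many times the starred group repeats.
pattern = re.compile(r"^(\w+)(\[\w+\])*$")
match = pattern.match(text)

# Number of capturing groups in the pattern.
print(pattern.groups)         # 2

# The starred group matched three times, but only the last capture is kept.
print(match.groups())         # ('aaaa', '[ef]')

# findall returns one tuple per match, and there is a single match.
print(pattern.findall(text))  # [('aaaa', '[ef]')]
```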

I'm not sure what you expect, so I don't know what regexp to suggest.

Andrew Clark · Answer

Based on your comment on aix's answer, it appears that you want to require the non-bracketed part to match. Maybe something like this is what you are looking for:

>>> a = re.compile(r"^(\w+)((?:\[\w+\])*)").findall(text)
>>> print(a)
[('aaaa', '[ab][cd][ef]')]

If you need to get the result ['aaaa', '[ab]', '[cd]', '[ef]'] instead of what is shown above, here is one method:

>>> match = re.compile(r"^(\w+)((?:\[\w+\])*)").search(text)
>>> a = [match.group(1)] + match.group(2).replace("][", "] [").split()
>>> print(a)
['aaaa', '[ab]', '[cd]', '[ef]']
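As a possible variation on the same idea, the bracketed run can be split with a second findall instead of replace and split. This is only a sketch: the trailing "$" and the None check are additions that are not in the answer above, and parts is an illustrative name.

```python
import re

text = "aaaa[ab][cd][ef]"

# Require the leading word, then capture the whole run of bracketed items
# as a single string. The trailing "$" is an added assumption: it rejects
# input with anything after the last bracket.
match = re.match(r"^(\w+)((?:\[\w+\])*)$", text)

if match:
    # Second pass: pull the individual bracketed items out of the captured run.
    parts = [match.group(1)] + re.findall(r"\[\w+\]", match.group(2))
else:
    parts = []  # the input did not have the expected shape

print(parts)  # ['aaaa', '[ab]', '[cd]', '[ef]']
```

Doing the work in two passes keeps the anchored check (the word must come first) separate from the extraction of the repeated items, which a single repeated capturing group cannot provide.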
