
Retrieve text inside #{ }

Tags: python, regex

I have the following text:

#{king} for a ##{day}, ##{fool} for a #{lifetime}

And the following (broken) regex:

[^#]#{[a-z]+}

I want to match all #{words} but not the ##{words} (doubling '#' acts as an escape).

Today I noticed that my regex ignores the first word: it refuses to match #{king}, although it correctly ignores ##{day} and ##{fool}.

>>> import re
>>> string = u"#{king} for a ##{day}, ##{fool} for a #{lifetime}"
>>> regex = re.compile("[^#]#{[a-z]+}")
>>> regex.findall(string)
[u' #{lifetime}']

Any suggestions on how to improve the current regex to suit my needs? I guess the problem is with [^#]...

Andrei Ciobanu asked Aug 19 '11 11:08


1 Answer

You need a negative lookbehind assertion. The corrected regex looks like this:

import re
t = "#{king} for a ##{day}, ##{fool} for a #{lifetime}"
re.findall(r'(?<!#)#{([a-z]+)}', t)

returns

['king', 'lifetime']

Explanation:

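The (?<!prefix)pattern expression matches pattern only if it is not preceded by prefix. Unlike [^#], which has to consume one actual character before the '#', the lookbehind consumes nothing, so it also succeeds at the very start of the string. That is why #{king} is matched now.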
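
To make the difference concrete, here is a small side-by-side sketch of the two patterns on the text from the question. It assumes Python 3 (so the results print without the u prefix); the variable names consuming and lookbehind are only illustrative.

```python
import re

text = "#{king} for a ##{day}, ##{fool} for a #{lifetime}"

# Original attempt: [^#] must consume one real character before '#',
# so a placeholder at the very start of the string cannot match,
# and the consumed character ends up in the result.
consuming = re.compile(r"[^#]#{[a-z]+}")

# Lookbehind version: (?<!#) only checks that the previous character
# is not '#'. It consumes nothing and also succeeds at position 0.
lookbehind = re.compile(r"(?<!#)#{([a-z]+)}")

print(consuming.findall(text))   # [' #{lifetime}']
print(lookbehind.findall(text))  # ['king', 'lifetime']
```

Note that the first pattern also drags the leading space into its match, while the capture group in the second returns only the word inside the braces.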

mdeous answered Oct 08 '22 17:10