How does one find the currency value in a string?

Question

I'm writing a small tool to extract a bunch of values from a string (usually a tweet).

The string could consist of words and numbers, along with an amount prefixed by a currency symbol (£, $, €, etc.) and a number of hashtags (#foo #bar). I'm running on App Engine and using tweepy to bring in the tweets.

The current code I have to find the values is below:

tagex = re.compile(r'#.*')
curex = re.compile(ur'[£].*')
for x in api.user_timeline(since_id=t.lastimport):
    tags = re.findall(tagex, x.text)
    amount = re.findall(curex, x.text)[0]
    logging.info("Text: " + x.text)
    logging.info("Tags: " + str(tags))
    logging.info("Amount: " + amount)

where x.text is for example "Taxi London £6.50 #projectfoo #clientmeeting"

The tagex finds the hashtags fine, but I can't get curex to extract just the amount. Currently I get: Amount: £6.50 #projectfoo #clientmeeting.

I also need to separate off the currency symbol so as to get the amount as a float, but that should be pretty simple later.

moinudin · Accepted Answer

>>> import re
>>> s = u"Taxi London £6.50 #projectfoo #clientmeeting"
>>> re.search(ur'([£$€])(\d+(?:\.\d{2})?)', s).groups()
(u'\xa3', u'6.50')

[£$€] matches one currency symbol
\d+(?:\.\d{2})? matches one or more digits, optionally followed by a decimal point and exactly two digits
The ()'s capture the symbol and amount separately
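
As a rough sketch of how the two captured groups could be used to get the amount as a float, here is one possible helper. It assumes Python 3, where plain string literals are already Unicode and the ur prefix is not available. The names CURRENCY_RE and find_amount are hypothetical and not part of the original answer.

```python
import re

# Same pattern as above: one currency symbol, then digits with an
# optional two-digit decimal part. Both parts are captured separately.
CURRENCY_RE = re.compile(r'([£$€])(\d+(?:\.\d{2})?)')


def find_amount(text):
    """Return (symbol, amount) for the first currency value, or None."""
    match = CURRENCY_RE.search(text)
    if match is None:
        # No currency value in this string.
        return None
    symbol, digits = match.groups()
    return symbol, float(digits)


result = find_amount("Taxi London £6.50 #projectfoo #clientmeeting")
print(result)  # ('£', 6.5)
```

Returning None when nothing matches avoids the exception that indexing an empty findall result with [0] would raise. If exact decimal arithmetic matters, decimal.Decimal(digits) could be used in place of float(digits).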

The problem with your regex is that .* matches any character (other than a newline) and is greedy, so at the end of a pattern it consumes everything up to the end of the line.
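
To illustrate the greediness, the following sketch compares the original pattern with a narrower one on the example tweet. It assumes Python 3 string literals. The #\w+ hashtag pattern is an assumption added here: #.* is greedy in the same way, so it would return all the hashtags as a single string.

```python
import re

text = "Taxi London £6.50 #projectfoo #clientmeeting"

# Greedy: .* runs to the end of the line, so the hashtags are swallowed too.
greedy = re.findall(r'[£].*', text)

# Narrow: only digits and an optional two-digit decimal part may follow.
narrow = re.findall(r'[£$€]\d+(?:\.\d{2})?', text)

# Hashtags: \w+ stops at the space, so each tag is matched on its own.
tags = re.findall(r'#\w+', text)

print(greedy)  # ['£6.50 #projectfoo #clientmeeting']
print(narrow)  # ['£6.50']
print(tags)    # ['#projectfoo', '#clientmeeting']
```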

Tags:

python

regex

currency

Sam Machin

1 Answers

moinudin
