Flexible numeric string parsing in Python

Are there any Python libraries that help parse and validate numeric strings beyond what is supported by the built-in float() function? For example, in addition to simple numbers (1234.56) and scientific notation (3.2e15), I would like to be able to parse formats like:

  • Numbers with commas: 2,147,483,647
  • Named large numbers: 5.5 billion
  • Fractions: 1/4

I did a bit of searching and could not find anything, though I would be surprised if such a library did not already exist.

Kevin Ivarsen asked Dec 07 '09 06:12


2 Answers

If you want to convert "localized" numbers such as the American "2,147,483,647" form, you can use the atof() function from the locale module. Example:

import locale
# The locale name is platform-dependent; some systems need 'en_US.UTF-8'.
locale.setlocale(locale.LC_NUMERIC, 'en_US')
print(locale.atof('1,234,456.23'))  # Prints 1234456.23
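
Note that setlocale() changes process-wide state and raises locale.Error when the named locale is not installed. Where that is a concern, a locale-independent fallback is possible. The following is only a minimal sketch, assuming that commas appear solely as thousands separators; parse_grouped is a hypothetical helper, not part of the locale module. Unlike locale.atof(), which simply strips the separator, it rejects malformed grouping such as '12,34'.

```python
import re

# Hypothetical helper pattern: an optional sign, one to three leading digits,
# one or more comma-separated groups of exactly three digits, and an
# optional fractional part.
_GROUPED = re.compile(r'[+-]?\d{1,3}(,\d{3})+(\.\d*)?')

def parse_grouped(num_str):
    """Parse strings such as '2,147,483,647' without touching the locale."""
    text = num_str.strip()
    if not _GROUPED.fullmatch(text):
        raise ValueError("Not a comma-grouped number: %r" % num_str)
    # Once the grouping is validated, the commas can be dropped safely.
    return float(text.replace(',', ''))

print(parse_grouped('2,147,483,647'))
```

Because the function raises ValueError on anything it does not recognize, it can take the place of the locale.atof() call in a try/except fallback chain.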

As for fractions, Python now handles them directly (since version 2.6); they can even be built from a string:

from fractions import Fraction
x = Fraction('1/4')
print(float(x))  # 0.25

Thus, you can parse a number written in any of the first three ways you mention using only the two standard modules above:

try:
    num = float(num_str)
except ValueError:
    try:
        num = locale.atof(num_str)
    except ValueError:
        try:
            num = float(Fraction(num_str))
        except ValueError:
            # Or handle '42 billion' here
            raise ValueError("Cannot parse '%s'" % num_str)
# 'num' has the numerical value of 'num_str', here.
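
The '42 billion' case mentioned in the comment above is not covered by the standard library. One possible way to handle it is sketched below, assuming short-scale (American) names and a "<number> <word>" layout; parse_named and the _MULTIPLIERS table are hypothetical names introduced for illustration only.

```python
from fractions import Fraction

# Assumed short-scale (American) multipliers; extend the table as needed.
_MULTIPLIERS = {
    'thousand': 10**3,
    'million': 10**6,
    'billion': 10**9,
    'trillion': 10**12,
}

def parse_named(num_str):
    """Parse strings such as '5.5 billion' into a float."""
    parts = num_str.split()
    if len(parts) != 2:
        raise ValueError("Expected '<number> <word>': %r" % num_str)
    value, word = parts
    if word.lower() not in _MULTIPLIERS:
        raise ValueError("Unknown multiplier word: %r" % word)
    # Fraction accepts '5.5', '42' and '1/4', and raises ValueError otherwise;
    # multiplying exactly before converting avoids intermediate rounding.
    return float(Fraction(value) * _MULTIPLIERS[word.lower()])

print(parse_named('5.5 billion'))
```

Since it raises ValueError for anything it cannot handle, the function could be called from the innermost except clause before giving up.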

Eric O Lebigot answered Oct 28 '22 06:10


It should be pretty straightforward to build one in pyparsing; in fact, one of the tutorial pyparsing projects (wordsToNum.py on this page) does some of it already. You're talking about things that don't really have standard representations (standard in the sense of ISO 8602, not standard in the sense of "what everybody knows"), so it could easily be that nobody's done just what you're looking for.

Robert Rossney answered Oct 28 '22 06:10