Are there any Python libraries that help parse and validate numeric strings beyond what is supported by the built-in float() function? For example, in addition to simple numbers (1234.56) and scientific notation (3.2e15), I would like to be able to parse formats like:
I did a bit of searching and could not find anything, though I would be surprised if such a library did not already exist.
If you want to convert "localized" numbers such as the American "2,147,483,647" form, you can use the atof() function from the locale module. Example:
import locale
# The locale name is platform-dependent (e.g. 'en_US.UTF-8' on many Linux systems).
locale.setlocale(locale.LC_NUMERIC, 'en_US')
print(locale.atof('1,234,456.23'))  # Prints 1234456.23
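Note that setlocale() changes the setting for the whole process, and the named locale has to be installed on the machine. If that is a concern, here is a minimal sketch that avoids the locale module entirely. The helper name parse_us_grouped and the regular expression are my own, and it assumes US-style grouping only (',' every three digits, '.' as the decimal point):

```python
import re

# Hypothetical helper, not part of the locale module.
# Assumes US-style grouping: ',' every three digits, '.' as decimal point.
_GROUPED = re.compile(r'[+-]?\d{1,3}(,\d{3})+(\.\d*)?$')

def parse_us_grouped(num_str):
    """Return the float value of a string such as '2,147,483,647'."""
    text = num_str.strip()
    if not _GROUPED.match(text):
        raise ValueError("Not a comma-grouped number: %r" % num_str)
    # The pattern has already checked the group positions,
    # so the commas can simply be dropped.
    return float(text.replace(',', ''))

print(parse_us_grouped('1,234,456.23'))  # 1234456.23
```

Unlike a plain replace(',', ''), this rejects malformed input such as '12,34' instead of silently reading it as 1234.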
As for fractions, Python now handles them directly (since version 2.6); they can even be built from a string:
from fractions import Fraction
x = Fraction('1/4')
print(float(x))  # 0.25
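For reference, the Fraction constructor accepts more than the 'a/b' form. A few examples, assuming a reasonably recent Python (the decimal and exponent string forms are documented for the standard fractions module, but I have not checked how far back they go):

```python
from fractions import Fraction

print(Fraction('3/8'))            # 3/8
print(Fraction('1.25'))           # 5/4
print(Fraction('7e-6'))           # 7/1000000
print(float(Fraction(' -3/4 ')))  # -0.75 (surrounding whitespace is allowed)

# Strings that are not numbers raise ValueError, like float() does.
try:
    Fraction('one quarter')
except ValueError as exc:
    print('rejected: %s' % exc)
```

Keeping the value as a Fraction instead of converting to float right away preserves it exactly, which may matter if you are validating input.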
Thus, you can parse a number written in any of the first three ways you mention using only the two standard modules above:
try:
    num = float(num_str)
except ValueError:
    try:
        num = locale.atof(num_str)
    except ValueError:
        try:
            num = float(Fraction(num_str))
        except ValueError:
            raise Exception("Cannot parse '%s'" % num_str)  # Or handle '42 billion' here
# 'num' has the numerical value of 'num_str', here.
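If the nesting gets unwieldy as you add formats, the same fallback chain can be written as a loop over parser callables. This is only a sketch of one way to arrange it; the function name parse_number and its parsers parameter are made up for illustration:

```python
import locale
from fractions import Fraction

def parse_number(num_str, parsers=(float, locale.atof, Fraction)):
    """Try each parser in order and return the first result as a float."""
    for parse in parsers:
        try:
            return float(parse(num_str))
        except (ValueError, ZeroDivisionError):
            # ZeroDivisionError covers inputs like '1/0' from Fraction.
            continue
    raise ValueError("Cannot parse %r" % num_str)

print(parse_number('1234.56'))  # 1234.56
print(parse_number('1/4'))      # 0.25
```

As with the nested version, locale.atof() only understands grouped numbers after a suitable locale.setlocale() call. Adding a new format then means appending one more callable to the tuple.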
It should be pretty straightforward to build one in pyparsing; in fact, one of the tutorial pyparsing projects (wordsToNum.py on this page) does some of it already. You're talking about things that don't really have standard representations (standard in the sense of ISO 8602, not standard in the sense of "what everybody knows"), so it could easily be that nobody's done just what you're looking for.
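For the simplest case, a number followed by a single magnitude word such as '42 billion', you may not need a parsing library at all. Here is a rough sketch in plain Python. To be clear, this is not how the pyparsing example works; the MAGNITUDES table and the parse_magnitude name are hypothetical, and the short-scale (American) values are an assumption:

```python
# Hypothetical lookup table; short-scale (American) values are an assumption.
MAGNITUDES = {
    'thousand': 1e3,
    'million': 1e6,
    'billion': 1e9,
    'trillion': 1e12,
}

def parse_magnitude(num_str):
    """Parse strings of the form '<number> <magnitude word>'."""
    parts = num_str.split()
    if len(parts) != 2 or parts[1].lower() not in MAGNITUDES:
        raise ValueError("Cannot parse %r" % num_str)
    # float() raises ValueError itself if the first token is not a number.
    return float(parts[0]) * MAGNITUDES[parts[1].lower()]

print(parse_magnitude('42 billion'))  # 42000000000.0
```

Anything richer, like 'forty-two billion' or 'two and a half million', is where a real grammar starts to pay off.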