We've just started to kick the tires on pyparsing and like it so far, but we've been unable to get it to parse fractional number strings and turn them into numeric data types.
For example, if a column value in a database table contained the string:
1 1/2
We'd like some way to convert it into the equivalent Python number:
1.5
We'd like to make a parser that doesn't care whether the numbers in the fraction are integer or real. For example, we'd like:
1.0 1.0/2.0
...to still translate to:
1.5
Conceptually, we'd like the parser to do the following:
"1 1/2" = 1 + 0.5 = 1.5
The following example code seems to get us close...
http://pyparsing.wikispaces.com/file/view/parsePythonValue.py
...but not close enough to make headway. All our tests to make a fractional number handler only return the first part of the expression (1). Tips? Hints? Timely Wisdom? :)
Since you cite some tests, it sounds like you've at least taken a stab at the problem. I assume you've already defined a single number, which can be integer or real - doesn't matter, you are converting everything to float anyway - and a fraction of two numbers, probably something like this:
from pyparsing import Regex, Optional
number = Regex(r"\d+(\.\d*)?").setParseAction(lambda t: float(t[0]))
fraction = number("numerator") + "/" + number("denominator")
fraction.setParseAction(lambda t: t.numerator / t.denominator)
(Note the use of parse actions, which do the floating point conversion and fractional division right at parse time. I prefer to do this while parsing, when I know something is a number or a fraction or whatever, instead of coming back later and sifting through a bunch of fragmented strings, trying to recreate the recognition logic that the parser has already done.)
Here are the test cases I composed for your problem, made up of a whole number, a fraction, and a whole number and fraction, using both integers and reals:
tests = """\
1
1.0
1/2
1.0/2.0
1 1/2
1.0 1/2
1.0 1.0/2.0""".splitlines()
for t in tests:
    print(t, fractExpr.parseString(t))
The last step is to define a fractional expression that can be a single number, a fraction, or a single number and a fraction.
Since pyparsing parses strictly left-to-right, it does not backtrack the way regexes do. So this expression won't work so well:
fractExpr = Optional(number) + Optional(fraction)
To sum together the numeric values that might come from the number and fraction parts, add this parse action:
fractExpr.setParseAction(lambda t: sum(t))
Our tests print out:
1 [1.0]
1.0 [1.0]
1/2 [1.0]
1.0/2.0 [1.0]
1 1/2 [1.5]
1.0 1/2 [1.5]
1.0 1.0/2.0 [1.5]
For the test case 1/2, containing just a fraction by itself, the leading numerator matches the Optional(number) term, but that leaves us just with "/2", which doesn't match the Optional(fraction) term. Fortunately, since the second term is optional, this "passes", but it's not really doing what we want.
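One way to make this kind of silent partial match visible (not part of the original answer, just a sketch) is to pass parseAll=True to parseString, which raises ParseException when any input is left unparsed. The names naive and parses_fully are made up for this illustration; number and fraction are the definitions from earlier in this answer.

```python
from pyparsing import Regex, Optional, ParseException

number = Regex(r"\d+(\.\d*)?").setParseAction(lambda t: float(t[0]))
fraction = number("numerator") + "/" + number("denominator")
fraction.setParseAction(lambda t: t.numerator / t.denominator)

# The first attempt from above: both parts optional, number tried first.
# "naive" is a hypothetical name used only in this sketch.
naive = Optional(number) + Optional(fraction)
naive.setParseAction(lambda t: sum(t))

def parses_fully(text):
    """Return True if `naive` consumes the whole string (hypothetical helper)."""
    try:
        naive.parseString(text, parseAll=True)
        return True
    except ParseException:
        return False

print(naive.parseString("1/2"))   # stops quietly after the leading "1"
print(parses_fully("1/2"))        # the leftover "/2" is now reported as a failure
print(parses_fully("1 1/2"))      # number followed by a fraction is consumed fully
```

Running the tests with parseAll=True would likely have flagged the lone-fraction cases as parse errors instead of returning 1.0.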
We need to make fractExpr a little smarter, and have it look first for a lone fraction, since there is this potential confusion between a lone number and the leading numerator of a fraction. The easiest way to do this is to make fractExpr read:
fractExpr = fraction | number + Optional(fraction)
Now with this change, our tests come out better:
1 [1.0]
1.0 [1.0]
1/2 [0.5]
1.0/2.0 [0.5]
1 1/2 [1.5]
1.0 1/2 [1.5]
1.0 1.0/2.0 [1.5]
There are a couple of classic pitfalls with pyparsing, and this is one of them. Just remember that pyparsing only does the lookahead that you tell it to; otherwise it is just straight left-to-right parsing.
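For convenience, here are the pieces of this answer collected into one script. The to_float wrapper and the use of parseAll=True are additions for this sketch, not something the answer above requires; everything else is the grammar as given.

```python
from pyparsing import Regex, Optional

# A number is an integer or a real; either way it is converted to float.
number = Regex(r"\d+(\.\d*)?").setParseAction(lambda t: float(t[0]))

# A fraction is number "/" number, reduced to a single float at parse time.
fraction = number("numerator") + "/" + number("denominator")
fraction.setParseAction(lambda t: t.numerator / t.denominator)

# Try a lone fraction first, then a number with an optional trailing fraction.
fractExpr = fraction | number + Optional(fraction)
fractExpr.setParseAction(lambda t: sum(t))

def to_float(text):
    """Hypothetical wrapper: parse the whole string and return one float."""
    return fractExpr.parseString(text, parseAll=True)[0]

for text in ["1", "1.0", "1/2", "1.0/2.0", "1 1/2", "1.0 1/2", "1.0 1.0/2.0"]:
    print(text, to_float(text))
```

Note that + binds more tightly than |, so the grammar reads as fraction | (number + Optional(fraction)).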
Not precisely what you're looking for, but...
>>> import fractions
>>> txt= "1 1/2"
>>> sum( map( fractions.Fraction, txt.split() ) )
Fraction(3, 2)
>>> float(_)
1.5
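One caveat with the snippet above: Fraction accepts a string like "1/2" or "1.0", but not "1.0/2.0", so the real-valued case from the question would raise ValueError. A rough way around that, assuming the input is always whitespace-separated terms of the form a or a/b, is to split each term on "/" yourself. The function name mixed_to_float is made up for this sketch.

```python
from fractions import Fraction

def mixed_to_float(text):
    """Sum whitespace-separated terms, each either 'a' or 'a/b' (hypothetical helper)."""
    total = Fraction(0)
    for part in text.split():
        numerator, slash, denominator = part.partition("/")
        value = Fraction(numerator)          # accepts "1" as well as "1.0"
        if slash:                            # term had a "/", so divide
            value /= Fraction(denominator)
        total += value
    return float(total)

print(mixed_to_float("1 1/2"))
print(mixed_to_float("1.0 1.0/2.0"))
```

This does no validation beyond what Fraction itself does, so malformed input such as "1/" raises ValueError.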