I have another problem with my boost::spirit parser.
template<typename Iterator>
struct expression: qi::grammar<Iterator, ast::expression(), ascii::space_type> {
expression() :
expression::base_type(expr) {
number %= lexeme[double_];
varname %= lexeme[alpha >> *(alnum | '_')];
binop = (expr >> '+' >> expr)[_val = construct<ast::binary_op<ast::add>>(_1,_2)]
| (expr >> '-' >> expr)[_val = construct<ast::binary_op<ast::sub>>(_1,_2)]
| (expr >> '*' >> expr)[_val = construct<ast::binary_op<ast::mul>>(_1,_2)]
| (expr >> '/' >> expr)[_val = construct<ast::binary_op<ast::div>>(_1,_2)] ;
expr %= number | varname | binop;
}
qi::rule<Iterator, ast::expression(), ascii::space_type> expr;
qi::rule<Iterator, ast::expression(), ascii::space_type> binop;
qi::rule<Iterator, std::string(), ascii::space_type> varname;
qi::rule<Iterator, double(), ascii::space_type> number;
};
This was my parser. It parsed "3.1415" and "var" just fine, but when I tried to parse "1+2" it told me the parse failed. I then tried to change the binop rule to:
binop = expr >>
(('+' >> expr)[_val = construct<ast::binary_op<ast::add>>(_1, _2)]
| ('-' >> expr)[_val = construct<ast::binary_op<ast::sub>>(_1, _2)]
| ('*' >> expr)[_val = construct<ast::binary_op<ast::mul>>(_1, _2)]
| ('/' >> expr)[_val = construct<ast::binary_op<ast::div>>(_1, _2)]);
But now it's of course not able to build the AST, because _1 and _2 are set differently. I have only seen something like _r1 mentioned, but as a Boost newbie I am not quite able to understand how boost::phoenix and boost::spirit interact.
How to solve this?
It isn't entirely clear to me what you are trying to achieve. Most importantly, are you not worried about operator associativity? I'll just show simple answers based on right-recursion, which groups operators to the right (that is, they are parsed as right-associative).
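To make the associativity point concrete, here is a minimal sketch in plain C++ (no Spirit involved). The helper names apply_op, eval_right and eval_left are made up for illustration, and neither function handles operator precedence; they differ only in how a flat chain of operands and operators is grouped:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: applies a single binary operator to two operands.
inline double apply_op(char op, double lhs, double rhs) {
    switch (op) {
        case '+': return lhs + rhs;
        case '-': return lhs - rhs;
        case '*': return lhs * rhs;
        default:  return lhs / rhs; // anything else is treated as '/'
    }
}

// Right-recursive evaluation: a op (b op (c ...)).
// Assumes operands.size() == ops.size() + 1.
inline double eval_right(const std::vector<double>& operands,
                         const std::string& ops, std::size_t i = 0) {
    if (i == ops.size()) return operands[i];
    return apply_op(ops[i], operands[i], eval_right(operands, ops, i + 1));
}

// Iterative fold: ((a op b) op c) ..., i.e. left-associative grouping.
// Assumes operands.size() == ops.size() + 1.
inline double eval_left(const std::vector<double>& operands,
                        const std::string& ops) {
    double acc = operands[0];
    for (std::size_t i = 0; i < ops.size(); ++i)
        acc = apply_op(ops[i], acc, operands[i + 1]);
    return acc;
}
```

With operands 1, 2, 3 and operators "--", eval_right computes 1 - (2 - 3) = 2, while eval_left computes (1 - 2) - 3 = -4.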
The straight answer to your visible question would be to juggle a fusion::vector2<char, ast::expression>, which isn't really any fun, especially in Phoenix lambda semantic actions. (I'll show below what that looks like.)
Meanwhile, I think you should read up on the calculator samples in the Spirit docs, which should give you a hint on why operator associativity matters, and how you would express a grammar that captures the associativity of binary operators. Obviously, they also show how to support parenthesized expressions to override the default evaluation order.
I have three versions of code that work, parsing input like:
std::string input("1/2+3-4*5");
into an ast::expression grouped like (using BOOST_SPIRIT_DEBUG):
<expr>
....
<success></success>
<attributes>[[1, [2, [3, [4, 5]]]]]</attributes>
</expr>
The links to the code are here:
- step_#1_reduce_semantic_actions.cpp
- step_#2_drop_rule.cpp
- step_#0_vector2.cpp
First thing, I'd get rid of the alternative parse expressions per operator; this leads to excessive backtracking (check that using BOOST_SPIRIT_DEBUG!). Also, as you've found out, it makes the grammar hard to maintain. So, here is a simpler variation that uses a function for the semantic action:
static ast::expression make_binop(char discriminant,
const ast::expression& left, const ast::expression& right)
{
switch(discriminant)
{
case '+': return ast::binary_op<ast::add>(left, right);
case '-': return ast::binary_op<ast::sub>(left, right);
case '/': return ast::binary_op<ast::div>(left, right);
case '*': return ast::binary_op<ast::mul>(left, right);
}
throw std::runtime_error("unreachable in make_binop");
}
// rules:
number %= lexeme[double_];
varname %= lexeme[alpha >> *(alnum | '_')];
simple = varname | number;
binop = (simple >> char_("-+*/") >> expr)
[ _val = phx::bind(make_binop, qi::_2, qi::_1, qi::_3) ];
expr = binop | simple;
As you can see, this has the potential to reduce complexity. It is only a small step now, to remove the binop intermediate (which has become quite redundant):
number %= lexeme[double_];
varname %= lexeme[alpha >> *(alnum | '_')];
simple = varname | number;
expr = simple [ _val = _1 ]
> *(char_("-+*/") > expr)
[ _val = phx::bind(make_binop, qi::_1, _val, qi::_2) ]
> eoi;
As you can see:
- in the expr rule, the _val lazy placeholder is used as a pseudo-local variable that accumulates the binops. Across rules, you'd have to use qi::locals<ast::expression> for such an approach. (This was your question regarding _r1.)
- the expr rule no longer needs to be an auto-rule (expr = instead of expr %=)
Finally, for fun and gory, let me show how you could have handled your suggested code, along with the shifting bindings of _1, _2 etc.:
static ast::expression make_binop(
const ast::expression& left,
const boost::fusion::vector2<char, ast::expression>& op_right)
{
switch(boost::fusion::get<0>(op_right))
{
case '+': return ast::binary_op<ast::add>(left, boost::fusion::get<1>(op_right));
case '-': return ast::binary_op<ast::sub>(left, boost::fusion::get<1>(op_right));
case '/': return ast::binary_op<ast::div>(left, boost::fusion::get<1>(op_right));
case '*': return ast::binary_op<ast::mul>(left, boost::fusion::get<1>(op_right));
}
throw std::runtime_error("unreachable in make_binop");
}
// rules:
expression::base_type(expr) {
number %= lexeme[double_];
varname %= lexeme[alpha >> *(alnum | '_')];
simple = varname | number;
binop %= (simple >> (char_("-+*/") > expr))
[ _val = phx::bind(make_binop, qi::_1, qi::_2) ]; // note _2!!!
expr %= binop | simple;
As you can see, it's not nearly as much fun writing the make_binop function that way!
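None of the snippets above show the ast namespace itself, so if you want to play with the make_binop dispatch outside of Spirit, here is a rough stand-in. Everything about the AST layout is an assumption on my part: it uses std::variant and std::shared_ptr (C++17) rather than whatever your real ast::expression is built on, and render, renderer, symbol and node are made-up helper names. Only the switch in make_binop mirrors the code above.

```cpp
#include <memory>
#include <sstream>
#include <stdexcept>
#include <string>
#include <variant>

namespace ast {
    // Operator tags, named like the ones used in the grammar.
    struct add {}; struct sub {}; struct mul {}; struct div {};

    template <typename Tag> struct binary_op;

    // Assumed stand-in for the real ast::expression:
    // a number, a variable name, or a shared binary node.
    using expression = std::variant<
        double, std::string,
        std::shared_ptr<binary_op<add>>, std::shared_ptr<binary_op<sub>>,
        std::shared_ptr<binary_op<mul>>, std::shared_ptr<binary_op<div>>>;

    template <typename Tag> struct binary_op { expression left, right; };
}

// Hypothetical helper: builds one node for the given operator tag.
template <typename Tag>
ast::expression node(const ast::expression& l, const ast::expression& r) {
    return std::make_shared<ast::binary_op<Tag>>(ast::binary_op<Tag>{l, r});
}

// Same dispatch on the operator character as in the answer.
inline ast::expression make_binop(char discriminant,
        const ast::expression& left, const ast::expression& right) {
    switch (discriminant) {
        case '+': return node<ast::add>(left, right);
        case '-': return node<ast::sub>(left, right);
        case '/': return node<ast::div>(left, right);
        case '*': return node<ast::mul>(left, right);
    }
    throw std::runtime_error("unreachable in make_binop");
}

inline char symbol(ast::add) { return '+'; }
inline char symbol(ast::sub) { return '-'; }
inline char symbol(ast::mul) { return '*'; }
inline char symbol(ast::div) { return '/'; }

std::string render(const ast::expression& e); // defined below

// Visitor that prints the tree fully parenthesized, so the grouping is visible.
struct renderer {
    std::string operator()(double d) const {
        std::ostringstream os; os << d; return os.str();
    }
    std::string operator()(const std::string& name) const { return name; }
    template <typename Tag>
    std::string operator()(const std::shared_ptr<ast::binary_op<Tag>>& n) const {
        return "(" + render(n->left) + " " + symbol(Tag{}) + " "
                   + render(n->right) + ")";
    }
};

inline std::string render(const ast::expression& e) {
    return std::visit(renderer{}, e);
}
```

For example, render(make_binop('+', 1.0, std::string("x"))) gives "(1 + x)", and nesting the calls to the right the way the right-recursive grammar does for 1/2+3-4*5 renders as "(1 / (2 + (3 - (4 * 5))))".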