I would like to write a Boost.Spirit X3 parser that parses complex numbers in the following possible input formats:
"(X+Yi)"
"Yj"
"X"
My best attempt so far is the following (Open on Coliru):
#include <complex>
#include <iostream>

#include <boost/fusion/include/adapt_struct.hpp>
#include <boost/spirit/home/x3.hpp>
#include <boost/spirit/home/x3/support/utility/error_reporting.hpp>

namespace x3 = boost::spirit::x3;

struct error_handler {
    template <typename iterator_t, typename error_t, typename context_t>
    auto on_error(iterator_t& /* iter */, const iterator_t& /* end */, const error_t& error,
                  const context_t& context) {
        namespace x3 = boost::spirit::x3;
        const auto& handler = x3::get<x3::error_handler_tag>(context).get();
        handler(error.where(), "error: expecting: " + error.which());
        return x3::error_handler_result::fail;
    }
};

// -----------------------------------------------------------------------------

namespace ast {
    template <typename T>
    struct complex_number {
        T real;
        T imag;
        operator std::complex<T>() {
            return {real, imag};
        }
    };
}  // namespace ast

BOOST_FUSION_ADAPT_STRUCT(ast::complex_number<double>, real, imag);

// -----------------------------------------------------------------------------

namespace parser {
    const auto pure_imag_number = x3::attr(0.) > x3::double_ > x3::omit[x3::char_("ij")];
    const auto pure_real_number = x3::double_ > x3::attr(0.);

    struct complex_class : error_handler {};
    const x3::rule<complex_class, ast::complex_number<double>> complex = "Complex number";
    static const auto complex_def = ('(' > (x3::double_ > -(x3::double_ > x3::omit[x3::char_("ij")])) >> ')')
                                    | pure_imag_number
                                    | pure_real_number;

    BOOST_SPIRIT_DEFINE(complex);
}  // namespace parser

// =============================================================================

void parse(const std::string& str) {
    using iterator_t = std::string::const_iterator;

    auto iter = std::begin(str);
    auto end = std::end(str);
    boost::spirit::x3::error_handler<iterator_t> handler(iter, end, std::cerr);

    const auto parser = boost::spirit::x3::with<boost::spirit::x3::error_handler_tag>(
        std::ref(handler))[parser::complex];

    std::complex<double> result{};
    if (boost::spirit::x3::phrase_parse(iter, end, parser, x3::space, result) && iter == end) {
        std::cout << "Parsing successful for:' " << str << "'\n";
    } else {
        std::cout << "Parsing failed for:' " << str << "'\n";
    }
}

int main() {
    for (const auto& str : {
             "(1+2j)",
             "(3+4.5j)",
             "1.23j",
             "42",
         }) {
        parse(str);
    }
    return 0;
}
Running the compiled code (GCC 12.1.1, Boost 1.79.0) gives the following output:
Parsing successful for:' (1+2j)'
Parsing successful for:' (3+4.5j)'
Parsing successful for:' 1.23j'
In line 1:
error: expecting: N5boost6spirit2x314omit_directiveINS1_8char_setINS0_13char_encoding8standardEcEEEE
42
__^_
Parsing failed for:' 42'
What puzzles me is why the last alternative is not tried when the input contains only a real number.
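You already found that expectation points are too strict when you need to allow backtracking: once the left-hand side of > has matched, a failure on the right-hand side is a hard error rather than a cue to try the next alternative. For "42", pure_imag_number matches the double and then demands an i or j, so pure_real_number is never reached.
Beware, though, that your grammar is a bit unusual in how it separates the two values: nothing requires a + or - between them. The sign is only matched as the optional unary sign that the double_ parser itself accepts, which is why an input such as (1 2j) is accepted as well.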
Here's a simplified test that highlights some of the edge cases:
static const auto ij      = x3::omit[x3::char_("ij")];
static const auto implied = x3::attr(0.);

static const auto complex =
    x3::rule<struct complex_, ast::complex_number<double>>{"complex"} //
    = ('(' >> x3::double_ >> ((x3::double_ >> ij) | implied) >> ')')  //
    | implied >> x3::double_ >> ij                                    //
    | x3::double_ >> implied;
The complete test (Live On Coliru) prints:
Parsing successful for: '(1+2j)' -> (1,2)
Parsing successful for: '(1 2j)' -> (1,2)
Parsing successful for: '(+1+2j)' -> (1,2)
Parsing successful for: '(+1-2j)' -> (1,-2)
Parsing successful for: '(-1-2j)' -> (-1,-2)
Parsing successful for: '(3+4.5j)' -> (3,4.5)
Parsing successful for: '1.23j' -> (0,1.23)
Parsing successful for: '42' -> (42,0)
Parsing successful for: 'inf' -> (inf,0)
Parsing successful for: '-infj' -> (0,-inf)
Parsing successful for: 'NaNj' -> (0,nan)
Parsing successful for: '(.0e9)' -> (0,0)
Parsing successful for: '(.0e-4)' -> (0,0)
Parsing successful for: '.0e-4i' -> (0,0)
Parsing successful for: '.0e-4j' -> (0,0)
Parsing successful for: '(3-0.e-4j)' -> (3,-0)
Parsing successful for: '(3-.0e-4j)' -> (3,-0)
Note that allowing whitespace in the non-parenthesized versions can easily lead to problems (ambiguous inputs, surprising misparses). I'd suggest skipping blanks only inside the parentheses:
static const auto complex =
    x3::rule<struct complex_, ast::complex_number<double>>{"complex"} //
    = x3::skip(x3::blank)['(' >> x3::double_ >>
                          ((x3::double_ >> ij) | implied) >> ')'] //
    | x3::lexeme[implied >> x3::double_ >> ij                     //
                 | x3::double_ >> implied];
So, @Eljay's comment is right...
The issue stems from using > where >> is needed: >> lets a sub-parser fail without triggering the error handler, so the next alternative can still be tried.
So to actually succeed, we need to use >> in these places:
const auto pure_imag_number = x3::attr(0.) >> x3::double_ >> x3::omit[x3::char_("ij")];
const auto pure_real_number = x3::double_ >> x3::attr(0.);
And only use > when we really want to abort immediately and report an error.