I can't get the string value of a token

I am trying to implement a lexer for a small programming language with Boost Spirit.

I need to get the value of a token, but I get a bad_get exception:

terminate called after throwing an instance of 'boost::bad_get'
  what(): boost::bad_get: failed value get using boost::get
Aborted

I get this exception when running:

std::string contents = "void";

base_iterator_type first = contents.begin();
base_iterator_type last = contents.end();

SimpleLexer<lexer_type> lexer;

lexer_type::iterator_type iter = lexer.begin(first, last);
lexer_type::iterator_type end = lexer.end();

std::cout << "Value = " << boost::get<std::string>(iter->value()) << std::endl;

My lexer is defined like this:

typedef std::string::iterator base_iterator_type;
typedef boost::spirit::lex::lexertl::token<base_iterator_type, boost::mpl::vector<unsigned int, std::string>> Tok;
typedef lex::lexertl::actor_lexer<Tok> lexer_type;

template<typename L>
class SimpleLexer : public lex::lexer<L> {
    private:

    public:
        SimpleLexer() {
            keyword_for = "for";
            keyword_while = "while";
            keyword_if = "if";
            keyword_else = "else";
            keyword_false = "false";
            keyword_true = "true";
            keyword_from = "from";
            keyword_to = "to";
            keyword_foreach = "foreach";

            word = "[a-zA-Z]+";
            integer = "[0-9]+";
            litteral = "...";

            left_parenth = '('; 
            right_parenth = ')'; 
            left_brace = '{'; 
            right_brace = '}'; 

            stop = ';';
            comma = ',';

            swap = "<>";
            assign = '=';
            addition = '+';
            subtraction = '-';
            multiplication = '*';
            division = '/';
            modulo = '%';

            equals = "==";
            not_equals = "!=";
            greater = '>';
            less = '<';
            greater_equals = ">=";
            less_equals = "<=";

            whitespaces = "[ \\t\\n]+";
            comments = "\\/\\*[^*]*\\*+([^/*][^*]*\\*+)*\\/";

            //Add keywords
            this->self += keyword_for | keyword_while | keyword_true | keyword_false | keyword_if | keyword_else | keyword_from | keyword_to | keyword_foreach;
            this->self += integer | litteral | word;

            this->self += equals | not_equals | greater_equals | less_equals | greater | less ;
            this->self += left_parenth | right_parenth | left_brace | right_brace;
            this->self += comma | stop;
            this->self += assign | swap | addition | subtraction | multiplication | division | modulo;

            //Ignore whitespaces and comments
            this->self += whitespaces [lex::_pass = lex::pass_flags::pass_ignore];
            this->self += comments [lex::_pass = lex::pass_flags::pass_ignore]; 
        }

        lex::token_def<std::string> word, litteral, integer;

        lex::token_def<lex::omit> left_parenth, right_parenth, left_brace, right_brace;

        lex::token_def<lex::omit> stop, comma;

        lex::token_def<lex::omit> assign, swap, addition, subtraction, multiplication, division, modulo;
        lex::token_def<lex::omit> equals, not_equals, greater, less, greater_equals, less_equals;

        //Keywords
        lex::token_def<lex::omit> keyword_if, keyword_else, keyword_for, keyword_while, keyword_from, keyword_to, keyword_foreach;
        lex::token_def<lex::omit> keyword_true, keyword_false;

        //Ignored tokens
        lex::token_def<lex::omit> whitespaces;
        lex::token_def<lex::omit> comments;
};

Is there another way to get the value of a token?

asked Oct 14 '11 08:10

Baptiste Wicht

1 Answer

You can always use the 'default' token data (which is an iterator_range of the source iterator type).

std::string tokenvalue(iter->value().begin(), iter->value().end());
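
The same copy can be wrapped in a small helper. This is only a sketch: range_to_string is a hypothetical name, not part of Boost, and it assumes nothing more than a range exposing begin() and end() over characters, which boost::iterator_range does.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not a Boost API): copies any character range that
// exposes begin()/end() into a std::string.
template <typename Range>
std::string range_to_string(const Range& r) {
    return std::string(r.begin(), r.end());
}
```

With a token type like the one in the question, where value() is a variant, the range presumably has to be extracted first, e.g. range_to_string(boost::get<boost::iterator_range<base_iterator_type> >(iter->value())), and that only works while the variant still holds the raw range.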

After studying the test cases in the Boost repository, I found out a number of things:

- this is by design
- there is an easier way
- the easier way is applied automatically in Lex semantic actions (e.g. using _1) and when using the lexer token in Qi; the assignment automatically converts to the Qi attribute type
- this does (indeed) have the 'lazy, one-time evaluation' semantics mentioned in the docs

The catch is that the token data is a variant, which starts out holding the raw input iterator range. Only after a forced assignment is the converted attribute cached in the variant. You can witness the transition:

lexer_type::iterator_type iter = lexer.begin(first, last);
lexer_type::iterator_type end = lexer.end();

assert(0 == iter->value().which());
std::cout << "Value = " << boost::get<boost::iterator_range<base_iterator_type> >(iter->value()) << std::endl;

std::string s;
boost::spirit::traits::assign_to(*iter, s);
assert(2 == iter->value().which()); // index 2 is std::string; index 0 is the raw iterator range
std::cout << "Value = " << s << std::endl;

As you can see, the attribute assignment is forced here, directly using the assign_to trait implementation.
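
To make the 'lazy, one-time' behaviour easier to picture, here is a rough model of it using C++17 std::variant instead of Boost. It is not how Spirit implements it; RawRange, TokenValue and force_string are hypothetical stand-ins for the iterator range, the token value and assign_to respectively.

```cpp
#include <cassert>
#include <string>
#include <variant>

// Stand-in for boost::iterator_range<std::string::const_iterator>.
struct RawRange {
    std::string::const_iterator first, last;
};

// Stand-in for the token value: index 0 is the raw range,
// index 1 an int, index 2 a std::string.
using TokenValue = std::variant<RawRange, int, std::string>;

// Hypothetical analogue of assign_to for strings: converts the raw range
// on the first call and caches the result in the variant. Later calls
// return the cached string. Throws std::bad_variant_access if the
// variant holds an int.
const std::string& force_string(TokenValue& v) {
    if (const RawRange* r = std::get_if<RawRange>(&v)) {
        v = std::string(r->first, r->last);
    }
    return std::get<std::string>(v);
}
```

In this model, calling std::get<std::string> before force_string throws std::bad_variant_access, which plays the role of the boost::bad_get from the question: the variant still holds the raw range at that point.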

Full working demonstration:

#include <boost/spirit/include/lex_lexertl.hpp>

#include <cassert>
#include <iostream>
#include <string>

namespace lex = boost::spirit::lex;

typedef std::string::iterator base_iterator_type;
typedef boost::spirit::lex::lexertl::token<base_iterator_type, boost::mpl::vector<int, std::string>> Tok;
typedef lex::lexertl::actor_lexer<Tok> lexer_type;

template<typename L>
class SimpleLexer : public lex::lexer<L> {
    private:

    public:
        SimpleLexer() {
            word = "[a-zA-Z]+";
            integer = "[0-9]+";
            literal = "...";

            this->self += integer | literal | word;
        }

        lex::token_def<std::string> word, literal;
        lex::token_def<int> integer;
};

int main(int argc, const char* argv[]) {
    SimpleLexer<lexer_type> lexer;

    std::string contents = "void";

    base_iterator_type first = contents.begin();
    base_iterator_type last = contents.end();

    lexer_type::iterator_type iter = lexer.begin(first, last);
    lexer_type::iterator_type end = lexer.end();

    assert(0 == iter->value().which());
    std::cout << "Value = " << boost::get<boost::iterator_range<base_iterator_type> >(iter->value()) << std::endl;

    std::string s;
    boost::spirit::traits::assign_to(*iter, s);
    assert(2 == iter->value().which());
    std::cout << "Value = " << s << std::endl;

    return 0;
}

answered Sep 26 '22 01:09

sehe

Tags: c++, boost, boost-spirit, boost-spirit-lex
