
C++ istream with lex

I have a working grammar (written in lex and bison) that parses polynomial expressions. It uses the standard textbook calculator syntax. Here is a very simplified version of the grammar:

Expr
: DOUBLE        {$$ = newConstExpr($1);}
| Expr '+' Expr {$$ = newBinaryExpr('+', $1, $3);}
| Expr '*' Expr {$$ = newBinaryExpr('*', $1, $3);}
| '(' Expr ')'  {$$ = $2;}
;

My problem is that Lex uses a FILE* for yyin, and I need to parse input from a C++ istream. I know that flex++ can generate the FlexLexer class (which can take an istream in its constructor), but it is difficult to get it to mesh with Bison, and even the author himself says (in the comments in the generated lexer file) that it is buggy.

So I am wondering if anyone knows a good way to use a flex scanner and bison parser with a C++ istream object as the input instead of a FILE*.

Nick asked Dec 02 '25 06:12


2 Answers

You can get input into lex however you want by defining a custom YY_INPUT macro.

For a real-world example, take a look at my parser.l:

http://www.kylheku.com/cgit/txr/tree/parser.l

Here, I redirect the flex scanner to work with special stream objects which are part of a dynamic object library. Like iostreams, these are not FILE *.

This allows me to do things like lexically analyze the command line when the program is run with -c <script text>.

(As an aside, the scanner works with 8 bit bytes. This is why the YY_INPUT macro uses my get_byte function. When the yyin_stream is a string stream, the get_byte implementation will actually put out the UTF-8 encoding bytes corresponding to the Unicode chars inside the string, so multiple get_byte calls may be necessary before the stream advances to the next character of the string. Over a file stream, get_byte just gets the byte from the underlying OS stream.)
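
To make the aside concrete, here is a minimal sketch of a byte-at-a-time source over a string of Unicode code points. It is not the txr implementation: the class name Utf8ByteSource and its internals are assumptions made for illustration, and only the get_byte name is borrowed from the description above. It does not reject surrogate code points.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical helper (not taken from txr): hands out the UTF-8 encoding of
// a UTF-32 string one byte at a time, so one character of the string may
// take several get_byte() calls to consume.
class Utf8ByteSource {
public:
  explicit Utf8ByteSource(std::u32string text) : text_(std::move(text)) {}

  // Returns the next byte (0..255), or -1 at end of input.
  int get_byte() {
    if (pending_pos_ == pending_.size()) {
      if (index_ == text_.size()) return -1;
      encode(text_[index_++]);  // advance to the next character
    }
    return static_cast<unsigned char>(pending_[pending_pos_++]);
  }

private:
  // Stores the UTF-8 bytes of cp in pending_.
  void encode(char32_t cp) {
    assert(cp <= 0x10FFFF);
    pending_.clear();
    pending_pos_ = 0;
    if (cp < 0x80) {
      pending_ += static_cast<char>(cp);
    } else if (cp < 0x800) {
      pending_ += static_cast<char>(0xC0 | (cp >> 6));
      pending_ += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
      pending_ += static_cast<char>(0xE0 | (cp >> 12));
      pending_ += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
      pending_ += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
      pending_ += static_cast<char>(0xF0 | (cp >> 18));
      pending_ += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
      pending_ += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
      pending_ += static_cast<char>(0x80 | (cp & 0x3F));
    }
  }

  std::u32string text_;
  std::size_t index_ = 0;        // next character of text_ to encode
  std::string pending_;          // bytes of the current character
  std::size_t pending_pos_ = 0;  // next byte of pending_ to hand out
};
```

A YY_INPUT definition could then fill the scanner's buffer by calling get_byte() until it returns -1 or the buffer is full.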

Kaz answered Dec 03 '25 20:12


This is a working example of a custom YY_INPUT macro to read from an interactive istream.

%{
// Place this code in istr.l and run with:
// $ flex istr.l && c++ istr.cpp && ./a.out
// $ flex istr.l && c++ istr.cpp && ./a.out 1a2b 123 abc
#include <iostream>

// The stream the lexer will read from.
// Declared as an extern
extern std::istream *lexer_ins_;

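// Define YY_INPUT to read from lexer_ins_.
// This definition mirrors the functionality of the default
// interactive YY_INPUT: it returns after each newline.
// It stops on any stream failure, not only on eof(), so a stream
// in a bad state cannot feed garbage bytes to the scanner.
#define YY_INPUT(buf, result, max_size)  \
  result = 0; \
  while (1) { \
    int c = lexer_ins_->get(); \
    if (!lexer_ins_->good()) { \
      break; \
    } \
    buf[result++] = static_cast<char>(c); \
    if (result == max_size || c == '\n') { \
      break; \
    } \
  }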

%}

/* Turn on all the warnings, don't call yywrap. */
%option warn nodefault noyywrap
/* stdinit not required - since using streams. */
%option nostdinit
%option outfile="istr.cpp"

%%
      /* Example rules. */
[0-9] { std::cout << 'd'; }
\n    { std::cout << std::endl; }
.     { std::cout << '.'; }
<<EOF>> { yyterminate(); }
%%

//
// Example main. This could be in its own file.
//
#include <sstream>
#include <string>

// Define the actual lexer stream (declared extern above).
std::istream *lexer_ins_;

int main(int argc, char** argv) {
  if (argc == 1) {
    // Use stdin
    lexer_ins_ = &std::cin;
    yylex();
  } else {
    // Use a string stream
    std::string data;
    for (int n = 1; n < argc; n++) {
      data.append(argv[n]);
      data.append("\n");
    }
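    // A local stream avoids leaking a heap-allocated one;
    // it stays alive for the whole yylex() call.
    std::istringstream iss(data);
    lexer_ins_ = &iss;
    yylex();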
  }
}

This style of scanner - using C++ but generated in the C-style - works fine for me. You might also try the experimental Flex option %option c++. See "Generating C++ Scanners" in the Flex manual. There doesn't seem to be much information about integrating these scanners with a Bison parser.
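
The macro above reads one character at a time because it mimics interactive input. When the stream is not interactive, a block read is a possible alternative. The following is only a sketch: read_block and read_block_demo are hypothetical names, and the YY_INPUT line in the comment assumes the same lexer_ins_ pointer as the example above.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>

// Hypothetical helper: fill buf with up to max_size bytes from in and
// return how many bytes were stored. 0 means end of input (or an error).
inline std::size_t read_block(std::istream &in, char *buf,
                              std::size_t max_size) {
  assert(buf != nullptr);
  in.read(buf, static_cast<std::streamsize>(max_size));
  return static_cast<std::size_t>(in.gcount());
}

/* In the definitions section of the .l file, assuming lexer_ins_ as above:
 *
 *   #define YY_INPUT(buf, result, max_size) \
 *     result = read_block(*lexer_ins_, buf, max_size)
 */

// Small self-check using an in-memory stream: 6 bytes read in blocks of 4.
inline bool read_block_demo() {
  std::istringstream in("abcdef");
  char buf[4];
  return read_block(in, buf, 4) == 4 &&  // "abcd"
         read_block(in, buf, 4) == 2 &&  // "ef", then end of input
         read_block(in, buf, 4) == 0;    // nothing left
}
```

With std::cin attached to a terminal this would wait for a full block before returning, which is why the per-line version is the better fit for interactive use.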

Finally, in case reading from memory is sufficient for your use case, you might be able to avoid redefining YY_INPUT - see yy_scan_buffer() in the Flex manual.
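
If the whole input fits in memory, one way to combine that with an istream is to copy the stream into a buffer first. The sketch below prepares a buffer in the layout yy_scan_buffer() requires, which is the data followed by two NUL bytes. The helper name slurp_for_flex is hypothetical, and the flex calls appear only in a comment because they need the generated scanner's declarations.

```cpp
#include <cassert>
#include <istream>
#include <iterator>
#include <sstream>  // only for the usage example
#include <vector>

// Hypothetical helper: copy the rest of `in` into a buffer that ends with
// the two NUL bytes yy_scan_buffer() expects.
inline std::vector<char> slurp_for_flex(std::istream &in) {
  std::vector<char> buf((std::istreambuf_iterator<char>(in)),
                        std::istreambuf_iterator<char>());
  buf.push_back('\0');
  buf.push_back('\0');
  return buf;
}

/* Intended use from code that can see the flex-generated declarations:
 *
 *   std::vector<char> buf = slurp_for_flex(some_stream);
 *   YY_BUFFER_STATE state = yy_scan_buffer(buf.data(), buf.size());
 *   yyparse();                 // or yylex()
 *   yy_delete_buffer(state);   // buf must stay alive until here
 */
```

yy_scan_buffer() scans the buffer in place rather than copying it, so the vector has to outlive the scan.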

Alan Green answered Dec 03 '25 20:12


