
Parsing errors with Bison

I'm writing my own scripting language using flex and bison. I have a grammar, and the parser I generate from it works fine on a correct script. I would also like to add meaningful error messages for specific error situations: for example, recognizing an unmatched parenthesis around a block of statements, a missing semicolon, and so on. Suppose I have these rules (the grammar shown here is not complete):

...
statements: statement SEMICOLON statements
    | statement SEMICOLON
    ;

statement: ifStatement
    | whileStatement
    ;

ifStatement: IF expression THEN statements END
    | IF expression THEN statements ELSE statements END
    ;

whileStatement: DO statements WHILE expression END
    ;
...

I would like to be able to print messages such as "Missing semicolon" or "Missing then keyword" and so on. Should I modify my grammar to enable error handling? Or is there some Bison feature to do this?

Salvatore asked Mar 23 '13 13:03


1 Answer

Update (Sept 2021)

Since version 3.6, Bison supports user-defined error messages: specify %define parse.error custom and provide a yyreport_syntax_error function, something like:

int
yyreport_syntax_error (const yypcontext_t *ctx)
{
  int res = 0;
  YYLOCATION_PRINT (stderr, *yypcontext_location (ctx));
  fprintf (stderr, ": syntax error");
  // Report the tokens expected at this point.
  {
    enum { TOKENMAX = 10 };
    yysymbol_kind_t expected[TOKENMAX];
    int n = yypcontext_expected_tokens (ctx, expected, TOKENMAX);
    if (n < 0)
      // Forward errors to yyparse.
      res = n;
    else
      for (int i = 0; i < n; ++i)
        fprintf (stderr, "%s %s",
                 i == 0 ? ": expected" : " or", yysymbol_name (expected[i]));
  }
  // Report the unexpected token.
  {
    yysymbol_kind_t lookahead = yypcontext_token (ctx);
    if (lookahead != YYSYMBOL_YYEMPTY)
      fprintf (stderr, " before %s", yysymbol_name (lookahead));
  }
  fprintf (stderr, "\n");
  return res;
}

More about this in the section "The Syntax Error Reporting Function yyreport_syntax_error" of the documentation.
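For Bison to call that function, the grammar has to opt in. A minimal sketch of the surrounding declarations, assuming a C parser; the rules are elided, and putting the function in the epilogue is just one reasonable layout:

```yacc
%define parse.error custom   /* Bison then calls yyreport_syntax_error.  */
%locations                   /* Needed here: the function prints
                                yypcontext_location (ctx).  */

%%
/* ... grammar rules ... */
%%

/* Define yyreport_syntax_error here, in the epilogue, where yypcontext_t
   and yysymbol_kind_t have already been declared by the generated parser.  */
```

With the function as written above, a message has the shape "LOCATION: syntax error: expected A or B before C".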

Original Answer (March 2013)

Bison is not really designed for generating custom error messages, but its standard error messages are quite usable, provided you enable %error-verbose (spelled %define parse.error verbose since Bison 3.0, where %error-verbose is deprecated). Have a look at the documentation: http://www.gnu.org/software/bison/manual/bison.html#Error-Reporting.

If you really want to provide custom error messages, read the documentation about YYERROR, write rules for the patterns you want to catch, and raise the errors yourself. For instance, here dividing by 0 is treated as a syntax error (which is dubious, but provides an example of a custom syntax error message).

 exp:
   NUM           { $$ = $1; }
 | exp '+' exp   { $$ = $1 + $3; }
 | exp '-' exp   { $$ = $1 - $3; }
 | exp '*' exp   { $$ = $1 * $3; }
 | exp '/' exp
     {
       if ($3)
         $$ = $1 / $3;
       else
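         {
           $$ = 1;
           /* Report the location of the zero divisor.  Add YYERROR; here
              to also make the parser start error recovery.  */
           fprintf (stderr, "%d.%d-%d.%d: division by zero\n",
                    @3.first_line, @3.first_column,
                    @3.last_line, @3.last_column);
         }
     }
 ;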
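Applied to the grammar in the question, the same idea can be combined with Bison's predefined error token. This is only a sketch: it assumes yyerror takes a single const char * argument, and because the question's grammar is incomplete, the extra rule may introduce conflicts that have to be checked in Bison's report.

```yacc
ifStatement: IF expression THEN statements END
    | IF expression THEN statements ELSE statements END
    | IF expression error statements END
        {
          /* Reached only after Bison has reported its own "syntax error"
             and resynchronized on the statements that follow the condition.  */
          yyerror ("Missing then keyword");
        }
    ;
```

Bison still prints its generic message when it detects the error; the action adds the more specific one once the whole if statement has been reduced.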

Note also that providing strings for tokens generates better error messages:

%token NUM

would generate unexpected NUM, while

%token NUM "number"

would generate unexpected number.

akim answered Sep 20 '22 23:09