Yacc/Bison: The pseudo-variables ($$, $1, $2,..) and how to print them using printf

Tags: yacc, bison

I have a lexical analyser written in flex that passes tokens to my parser written in bison.

The following is a small part of my lexer:

ID [a-z][a-z0-9]*

%%

rule {
    printf("A rule: %s\n", yytext);
    return RULE;
}

{ID} { 
    printf( "An identifier: %s\n", yytext );
    return ID;
}

"(" return LEFT;
")" return RIGHT;

There are other bits for parsing whitespace etc too.

Then part of the parser looks like this:

%{
#include <stdio.h>
#include <stdlib.h>
#define YYSTYPE char*
%}

%token ID RULE 
%token LEFT RIGHT 

%%

rule_decl : 
    RULE LEFT ID RIGHT { printf("Parsing a rule, its identifier is: %s\n", $2); }
    ;

%%

It's all working fine but I just want to print out the ID token using printf - that's all :). I'm not writing a compiler.. it's just that flex/bison are good tools for my software. How are you meant to print tokens? I just get (null) when I print.

Thank you.

asked Jul 05 '11 20:07 by ale


1 Answer

I'm not an expert at yacc, but the way I've been handling the transition from the lexer to the parser is as follows: for each lexer token, you should have a separate rule to "translate" the yytext into a suitable form for your parser. In your case, you are probably just interested in yytext itself (while if you were writing a compiler, you'd wrap it in a SyntaxNode object or something like that). Try

%token ID RULE 
%token LEFT RIGHT

%%

rule_decl:
    RULE LEFT id RIGHT { printf("%s\n", $3); }

id:
    ID { $$ = strdup(yytext); }

The point is that the last rule copies yytext into $$, the semantic value of id, so any rule that uses id can read that string through the matching $n (here $3, because id is the third symbol in rule_decl).
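Another common way to get the same result is to hand the text over from the lexer through yylval, so the parser never has to read yytext itself. The following is only a minimal sketch of the lexer side. It assumes the parser keeps `#define YYSTYPE char*` as in the question, and that bison is run with `-d -y` so the generated header is named y.tab.h (the header name is an assumption and depends on how bison is invoked):

```lex
%{
#include <string.h>      /* strdup */
#define YYSTYPE char *   /* must match the definition in the parser */
#include "y.tab.h"       /* assumed header name; declares ID and yylval */
%}

ID [a-z][a-z0-9]*

%%

{ID} {
    /* Copy the text: yytext is overwritten when the next token is matched. */
    yylval = strdup(yytext);
    return ID;
}
```

With this in place, the original rule can keep using the ID token directly, with an action such as `printf("Parsing a rule, its identifier is: %s\n", $3); free($3);`. Note that ID is the third symbol in `RULE LEFT ID RIGHT`, so its value is $3. The question prints $2, which is the value of LEFT. Since the lexer in the question never assigns yylval, every token's value is still NULL, which is most likely why printf shows (null).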

answered Sep 27 '22 20:09 by Aasmund Eldhuset