I'm writing a Bison/Flex program to convert LaTeX into MathML. At the moment, dealing with functions (e.g. \sqrt, \frac, etc.) works like this, with one token for every function. In flex:
\\frac {return FUNC_FRAC;}
which passes the token FUNC_FRAC back to bison, where it plays its part in the rule for this subtree:
function: FUNC_FRAC LBRACE atom RBRACE LBRACE atom RBRACE {$$ = "<mfrac>" + $3 + $6 + "</mfrac>";}
But this means that I need to define and juggle a potentially unlimited number of tokens. What I would like to do is something like this, which doesn't work as written. In flex:
\\[A-Za-z]+[0-9]* {return the-matched-string;}
and in bison:
function: "\frac" LBRACE atom RBRACE LBRACE atom RBRACE {$$ = "<mfrac>" + $3 + $6 + "</mfrac>";}
Flex should return an abstract token type (an integer code) to Bison, not the matched string itself.
You can find the lexeme (the string matched) in Flex in the value:
yytext
And so you can do:
{id} { yylval.strval = strdup(yytext); return TOK_ID; }
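The copy matters because yytext points into the scanner's own buffer, which gets overwritten by the next match. Here is a minimal plain-C sketch of that effect. Note that scan_buffer, fake_match and copy_lexeme are made-up stand-ins for illustration, not part of the Flex API:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for the scanner's internal match buffer (what yytext points at).
   Hypothetical: the real buffer is managed by the Flex-generated code. */
static char scan_buffer[64];

/* Hypothetical helper: pretend the scanner has just matched `lexeme`.
   Returns a pointer into the shared buffer, like yytext. */
static const char *fake_match(const char *lexeme)
{
    assert(lexeme != NULL);
    strncpy(scan_buffer, lexeme, sizeof scan_buffer - 1);
    scan_buffer[sizeof scan_buffer - 1] = '\0';
    return scan_buffer;
}

/* Heap copy of the current match; does the same job as strdup(yytext).
   The caller owns the result and must free() it. */
static char *copy_lexeme(const char *text)
{
    size_t len = strlen(text) + 1;
    char *copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, text, len);
    return copy;
}
```

If a token kept the pointer returned by fake_match("\\frac") and the scanner then matched "\\sqrt", that pointer would now read "\\sqrt". A copy_lexeme result taken in between still reads "\\frac", which is why the action copies before returning.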
And so forth. yylval is of the type you declare with %union in Bison (YYSTYPE), and it carries the semantic value that goes with the token type. So I might have in Bison:
%union {
char *strval;
int intval;
node node_val;
}
Returning anything other than a token type will break Bison's automaton. Your Bison actions can then access the value like this:
id_production: TOK_ID
    {
        /* $<strval>1 is the value the lexer stored in yylval for this token */
        $<node_val>$ = create_id_node($<strval>1);
        xfree($<strval>1); // create_id_node makes a copy, so freeing here is safe
    }
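For the LaTeX case in the question, one way to apply this (a sketch, not the only way) is to keep a small table keyed on the command name and let the single Flex rule look the lexeme up. The token codes FUNC1, FUNC2 and UNKNOWN_CMD, the struct command type and the classify_command helper are all names I made up. In a real build the token codes would come from the Bison-generated header.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token codes; normally generated by Bison from
   something like: %token FUNC1 FUNC2 UNKNOWN_CMD */
enum { FUNC1 = 258, FUNC2 = 259, UNKNOWN_CMD = 260 };

/* One row per LaTeX command: its name, the MathML element to emit,
   and the token type the lexer should return (here, one per arity). */
struct command {
    const char *name;
    const char *tag;
    int token;
};

static const struct command commands[] = {
    { "\\sqrt", "msqrt", FUNC1 },   /* one braced argument  */
    { "\\frac", "mfrac", FUNC2 },   /* two braced arguments */
};

/* Linear search over the table; returns NULL for an unknown command. */
static const struct command *lookup_command(const char *lexeme)
{
    assert(lexeme != NULL);
    size_t n = sizeof commands / sizeof commands[0];
    for (size_t i = 0; i < n; i++)
        if (strcmp(commands[i].name, lexeme) == 0)
            return &commands[i];
    return NULL;
}

/* What the action for \\[A-Za-z]+[0-9]* could do with yytext:
   store the MathML tag in *tag_out (standing in for yylval.strval)
   and return the token type for Bison. */
static int classify_command(const char *lexeme, const char **tag_out)
{
    const struct command *c = lookup_command(lexeme);
    *tag_out = c ? c->tag : NULL;
    return c ? c->token : UNKNOWN_CMD;
}
```

With this, the grammar needs one rule per arity rather than one per command, and the action can build the element from the tag carried as the token's semantic value. Adding a new command is then a one-line change to the table.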
And so on. Any more explanation than this and I would probably start repeating the documentation.
Good Luck