Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Describing operator precedence using EBNF

I have written tokenizer and expression evaluator for a preprocessor language that I plan to use in my later projects. I started thinking that maybe I should describe the language with EBNF (Extended Backus–Naur Form) to keep the syntax more maintainable or even use it to generate later versions of a parser.

My first impression was that EBNF is used for tokenizing process and syntax validation. Later I discovered that it can also be used to describe operator precedence like in this post or in the Wikipedia article:

expression ::= equality-expression
equality-expression ::= additive-expression ( ( '==' | '!=' ) additive-expression ) *
additive-expression ::= multiplicative-expression ( ( '+' | '-' ) multiplicative-expression ) *
multiplicative-expression ::= primary ( ( '*' | '/' ) primary ) *
primary ::= '(' expression ')' | NUMBER | VARIABLE | '-' primary

I can see how that allows generator to produce code with operator precedence built in but is this really how precedence should be expressed? Isn't operator precedence more about semantics and EBNF about syntax? If I decide to write description of my language in EBNF, should I write it with operator precedence taken into account or document that in a separate section?

like image 403
Mika Lammi Avatar asked Aug 07 '14 14:08

Mika Lammi


1 Answers

Did a similar stuff for my collegue degree.

I suggest DO NOT use the operator precedence feature, even if looks easier like "syntact sugar".

Why ? Because most languages to be described by EBNF, use a lot of operators with different features, that are better to describe & update, with EBNF expressions, instead of operator precedence.

Some operators are unary prefix, some unary posfix, some are binary (a.k.a. "infix"), some binary are evaluated from left to right, & some are evaluated from right to left. Some symbols are operators in some context, and used as other tokens, in other context, example "+", "-", that can be binary operators ("x - y"), unary prefix operators ("x - -y"), or part of a literal ("x + -5").

In my experience its more "safe" to describe them with EBNF expressions. Unless the programming language you describe, is very small, with very few and similar syntax operators (example: all binary, or all prefix unary).

Just my 2 cents.

like image 153
umlcat Avatar answered Sep 30 '22 16:09

umlcat