
Grammar: difference between a top down and bottom up? (Example)

This is a follow-up to the question "Grammar: difference between a top down and bottom up?"

I understand from that question that:

  • the grammar itself isn't top-down or bottom-up, the parser is
  • there are grammars that can be parsed by one but not the other
  • (thanks Jerry Coffin)

So for this grammar (all possible mathematical formulas):

    E -> E T E
    E -> (E)
    E -> D

    T -> + | - | * | /

    D -> 0
    D -> L G

    G -> G G    
    G -> 0 | L

    L -> 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 

Would this be readable by a top down and bottom up parser?

Could you say that this is a top down grammar or a bottom up grammar (or neither)?


I am asking because I have a homework question that asks:

"Write top-down and bottom-up grammars for the language consisting of all ..." (different question)

I am not sure if this can be correct since it appears that there is no such thing as a top-down and bottom-up grammar. Could anyone clarify?

sixtyfootersdude asked Jul 05 '10 21:07


1 Answer

That grammar is stupid, since it unites lexing and parsing as one. But ok, it's an academic example.

The thing with bottom-up and top-down parsing is that each has special corner cases that are difficult to handle with the normal one token of lookahead. I think you are probably expected to check whether the grammar has any such problems and change it accordingly.

To understand your grammar, I rewrote it in a more readable BNF notation:

expr:
    expr op expr |
    '(' expr ')' |
    number;

op:
    '+' |
    '-' |
    '*' |
    '/';

number:
    '0' |
    digit digits;

digits:
    '0' |
    digit |
    digits digits;

digit:
    '1' | 
    '2' | 
    '3' | 
    '4' | 
    '5' | 
    '6' | 
    '7' | 
    '8' | 
    '9'; 

I especially don't like the rule digits: digits digits. It is ambiguous: it is unclear where the first digits ends and the second begins. I would write the rule as

digits:
    '0' |
    digit |
    digits digit;
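
To make the ambiguity concrete, here is a small sketch that counts how many parse trees each version of the rule allows for a run of n nonzero digits. The function names are made up for this illustration, and it only counts trees rather than building them.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count_ambiguous(n: int) -> int:
    """Parse trees for n nonzero digits under digits: '0' | digit | digits digits."""
    if n == 1:
        return 1  # a single digit has exactly one derivation
    # digits digits: every split point between the two halves is a valid parse
    return sum(count_ambiguous(k) * count_ambiguous(n - k) for k in range(1, n))


@lru_cache(maxsize=None)
def count_left_recursive(n: int) -> int:
    """Parse trees for n nonzero digits under digits: '0' | digit | digits digit."""
    if n == 1:
        return 1
    # digits digit: the last character is always the trailing digit, so no choice
    return count_left_recursive(n - 1)


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, count_ambiguous(n), count_left_recursive(n))
```

With digits digits, "123" already has two parse trees, (1 2) 3 and 1 (2 3), and the count grows quickly from there. With digits digit there is always exactly one.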

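Another problem is number: '0' | digit digits;. Its '0' alternative and its leading digit duplicate what digits: '0' and digits: digit already cover, and because digit digits needs at least two characters, a single nonzero digit such as 7 cannot be derived as a number at all. I would change the rules to (removing digits):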

number:
    '0' |
    digit |
    digit zero_digits;

zero_digits:
    zero_digit |
    zero_digits zero_digit;

zero_digit:
    '0' |
    digit;
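
As a quick sanity check of what these rules accept: the language of number is regular, so it can be written as a regular expression, which is also how a separate lexer would normally handle it. The sketch below assumes Python's re module; the name is_number is only for illustration.

```python
import re

# number: '0' | digit | digit zero_digits
# i.e. a lone zero, or a nonzero digit followed by any digits
NUMBER = re.compile(r"0|[1-9][0-9]*")


def is_number(text: str) -> bool:
    """True if the whole string is derivable from the number rule."""
    return NUMBER.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in ["0", "7", "10", "1203", "007", "00", ""]:
        print(repr(sample), is_number(sample))
```

Leading zeros such as "007" are rejected, while "0", "7" and "1203" are accepted.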

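This makes the number rules unambiguous and left recursive, which is what an LR(1) parser (left-to-right scan, rightmost derivation, one token of lookahead) handles well, and the grammar is context free. This is what you would normally give to a parser generator such as bison. And since bison is bottom-up, this is valid input for a bottom-up parser. Note that expr: expr op expr is still ambiguous, so bison will report shift/reduce conflicts for it until that rule is restructured as well.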
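
To give a feel for how a bottom-up parser works through the left-recursive number rules, here is a hand-written shift-reduce recognizer. It is only a sketch: the reduce decisions are hard-coded for these three rules, it is not the table-driven automaton bison would generate, and the function name is made up.

```python
from typing import List, Tuple


def shift_reduce_number(text: str) -> Tuple[bool, int]:
    """Return (accepted, maximum stack depth) for the number rules."""
    stack: List[str] = []
    max_depth = 0
    for ch in text:
        if ch not in "0123456789":
            return False, max_depth
        stack.append(ch)  # shift the next character
        max_depth = max(max_depth, len(stack))
        if len(stack) == 1:
            # First character: '0' stays a terminal, 1-9 reduces to digit.
            if ch != "0":
                stack[-1] = "digit"
            continue
        stack[-1] = "zero_digit"  # reduce '0' or 1-9 to zero_digit
        if stack[-2] == "zero_digits":
            stack.pop()  # reduce zero_digits zero_digit to zero_digits
        else:
            stack[-1] = "zero_digits"  # reduce a lone zero_digit to zero_digits
    # Final reduction to number: '0' | digit | digit zero_digits
    accepted = stack in (["0"], ["digit"], ["digit", "zero_digits"])
    return accepted, max_depth


if __name__ == "__main__":
    print(shift_reduce_number("1203"))
    print(shift_reduce_number("007"))
```

Because the recursion is on the left, each new character is reduced into the existing zero_digits right away, so the stack never holds more than three symbols no matter how long the number is.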

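For a top-down approach, at least for recursive descent, left recursion is a problem: the function for a left-recursive rule calls itself before consuming any input and never terminates. You can use backtracking if you like, but for these parsers you want a right-recursive grammar that works with one token of lookahead (the LL(1) style). To do that, swap the recursion: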

zero_digits:
    zero_digit |
    zero_digit zero_digits;
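
A recursive-descent recognizer for the right-recursive version could look roughly like the sketch below, with one method per rule. The class and helper names are invented for this example. Since both alternatives of zero_digits start with zero_digit, the code parses that common prefix once and then peeks one character ahead to decide whether to recurse, which amounts to left factoring the rule by hand.

```python
class NumberParser:
    """Recursive-descent recognizer: one method per grammar rule."""

    def __init__(self, text: str) -> None:
        self.text = text
        self.pos = 0

    def next_is(self, chars: str) -> bool:
        # One character of lookahead; False at end of input.
        return self.pos < len(self.text) and self.text[self.pos] in chars

    def number(self) -> bool:
        # number: '0' | digit | digit zero_digits
        if self.next_is("0"):
            self.pos += 1
            return True
        if not self.digit():
            return False
        if self.next_is("0123456789"):
            return self.zero_digits()
        return True

    def zero_digits(self) -> bool:
        # zero_digits: zero_digit | zero_digit zero_digits
        if not self.zero_digit():
            return False
        if self.next_is("0123456789"):
            # Right recursion: input was consumed before the recursive call.
            return self.zero_digits()
        return True

    def zero_digit(self) -> bool:
        # zero_digit: '0' | digit
        if self.next_is("0"):
            self.pos += 1
            return True
        return self.digit()

    def digit(self) -> bool:
        # digit: '1' | ... | '9'
        if self.next_is("123456789"):
            self.pos += 1
            return True
        return False


def accepts(text: str) -> bool:
    """True if the whole input is a number according to the grammar."""
    parser = NumberParser(text)
    return parser.number() and parser.pos == len(text)


if __name__ == "__main__":
    for sample in ["0", "7", "1203", "007", ""]:
        print(repr(sample), accepts(sample))
```

With the left-recursive form, zero_digits would have to call itself as its very first step, at the same input position, which is exactly the endless recursion described above.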

I am not sure if that answers your question. I think the question is badly formulated and misleading; and I write parsers for a living...

rioki answered Sep 29 '22 12:09