 

Why does Babel use a top-down parser?

I'm studying compiler construction and naturally I'm also studying real world implementations of these concepts. One example of this is Babel's parser: Babylon.

I went through Babylon's code and it appears to use a top-down parser with embedded ad hoc semantic rules. src

I was expecting Babel to use a member of the LR family of parsers, probably with a definition file where the grammar productions are coupled with semantic rules. Why? Mostly because a bunch of other real-world languages use LR parser generators such as Yacc and Bison, which give you this exact interface. That seems like a clearer and more maintainable way of representing these rules, even more so when you consider that Babel lives on the edge of the JavaScript standard, implementing new things all the time.

I have also built both top-down and bottom-up (LR) parsers, and I don't see a big difference in implementation difficulty between the two (both are equally difficult :) )

So, why does Babel's parser use top-down, ad hoc syntax-directed translation instead of what I see as a more structured approach? What are the design decisions behind that? What am I missing?

Thanks!

franleplant asked Dec 18 '22 03:12



1 Answer

I feel like you're really asking two (or maybe three) questions, so I'll address them separately.

In general, what are the advantages and disadvantages of different approaches to parsing?

Top down vs. bottom up

For hand-written parsers the situation is actually pretty clear: top-down parsers are much easier to write and maintain, to the point that I've never even seen a hand-written bottom-up parser.
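To give a feel for why hand-written top-down parsers are easy to follow, here is a minimal recursive-descent sketch in TypeScript. The toy arithmetic grammar and all names (`Expr`, `tokenizeArithmetic`, `parseArithmetic`) are made up for illustration and have nothing to do with Babylon's actual code. Each grammar rule becomes one function, so the code mirrors the grammar directly.

```typescript
// Toy grammar (an assumption for this sketch, not Babylon's grammar):
//   expr   := term (("+" | "-") term)*
//   term   := factor (("*" | "/") factor)*
//   factor := NUMBER | "(" expr ")"

type Expr =
  | { type: "Num"; value: number }
  | { type: "Bin"; op: string; left: Expr; right: Expr };

// Simplistic tokenizer: picks out numbers, operators and parentheses.
// Any other character is silently skipped, which a real tokenizer would reject.
function tokenizeArithmetic(src: string): string[] {
  return src.match(/\d+(?:\.\d+)?|[-+*/()]/g) ?? [];
}

function parseArithmetic(src: string): Expr {
  const tokens = tokenizeArithmetic(src);
  let pos = 0;

  const peek = (): string | undefined => tokens[pos];
  const next = (): string => {
    const tok = tokens[pos];
    if (tok === undefined) throw new SyntaxError("unexpected end of input");
    pos++;
    return tok;
  };

  // expr := term (("+" | "-") term)*
  function parseExpr(): Expr {
    let left = parseTerm();
    while (peek() === "+" || peek() === "-") {
      const op = next();
      left = { type: "Bin", op, left, right: parseTerm() };
    }
    return left;
  }

  // term := factor (("*" | "/") factor)*
  function parseTerm(): Expr {
    let left = parseFactor();
    while (peek() === "*" || peek() === "/") {
      const op = next();
      left = { type: "Bin", op, left, right: parseFactor() };
    }
    return left;
  }

  // factor := NUMBER | "(" expr ")"
  function parseFactor(): Expr {
    const tok = next();
    if (tok === "(") {
      const inner = parseExpr();
      if (peek() !== ")") throw new SyntaxError("expected ')'");
      pos++;
      return inner;
    }
    if (/^\d/.test(tok)) return { type: "Num", value: Number(tok) };
    throw new SyntaxError(`unexpected token: ${tok}`);
  }

  const ast = parseExpr();
  if (pos < tokens.length) {
    throw new SyntaxError(`unexpected token: ${tokens[pos]}`);
  }
  return ast;
}
```

Calling `parseArithmetic("1 + 2 * 3")` returns a tree whose root is the `+` node with the `*` node as its right child, because operator precedence falls out of which rule calls which.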

For parser generators the situation is less clear. Both types of parser generators exist (for example yacc and bison are bottom-up and ANTLR and JavaCC are top-down). Both have their advantages and disadvantages and I don't think there's much cause to say that one approach is clearly better than the other.

In fact I'd say it usually makes no sense to decide between top-down and bottom-up parsing. When hand-writing your parser, always go with the former. When using a parser generator, you should simply choose the tool whose features best fit your project, not based on whether it generates bottom-up or top-down parsers.

Hand-written parsers vs. parser generators

There are many reasons why one would hand-write a parser. These also depend on which parser generators are even available for the language. One shortcoming that parser generators often suffer from is that they make it hard to generate good error messages for syntax errors.
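As a rough illustration of the error-message point, a hand-written parser knows exactly what it was in the middle of parsing when something goes wrong, so it can say so. The sketch below parses a made-up list syntax like `[1, 2, 3]`. The syntax, `ListSyntaxError` and `parseNumberList` are hypothetical and only meant to show the idea.

```typescript
// Hypothetical error type that carries the offending position.
class ListSyntaxError extends Error {
  index: number;
  constructor(message: string, index: number) {
    super(`${message} (at index ${index})`);
    this.index = index;
  }
}

// Parses a made-up list syntax such as "[1, 2, 3]" into number[].
// Input after the closing ']' is ignored to keep the sketch short.
function parseNumberList(src: string): number[] {
  let pos = 0;
  const skipSpaces = (): void => {
    while (src.charAt(pos) === " ") pos++;
  };

  skipSpaces();
  if (src.charAt(pos) !== "[") {
    throw new ListSyntaxError("a list must start with '['", pos);
  }
  const open = pos++;
  const items: number[] = [];

  skipSpaces();
  if (src.charAt(pos) === "]") return items; // empty list

  while (true) {
    skipSpaces();
    const start = pos;
    while (/[0-9]/.test(src.charAt(pos))) pos++;
    if (pos === start) throw new ListSyntaxError("expected a number", pos);
    items.push(Number(src.slice(start, pos)));

    skipSpaces();
    if (src.charAt(pos) === ",") {
      pos++;
      continue;
    }
    if (src.charAt(pos) === "]") return items;

    // At this point the parser knows the context, so the message can be specific.
    if (pos >= src.length) {
      throw new ListSyntaxError(
        `missing ']' to close the list opened at index ${open}`,
        pos
      );
    }
    throw new ListSyntaxError("expected ',' or ']' after a number", pos);
  }
}
```

With input `[1, 2` this reports that the list opened at index 0 was never closed, rather than a generic "unexpected end of input". Getting that kind of context-aware wording out of a generated parser usually takes extra effort.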

Another possible problem is that for non-context-free languages you might need some dirty hacks to implement them using a parser generator, or it might just not be possible at all.

How do these factors apply specifically to Babylon?

Hand-written parsers vs. parser generators

The JavaScript grammar is quite complicated with a lot of special cases to resolve ambiguities. It would probably require extensive hacks when using a parser generator and might not be possible at all with the parser generators available for JavaScript.
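One well-known example of such a special case is the `/` character, which can either be the division operator (`a / b`) or the start of a regular expression literal (`x = /b/`), depending on what came before it. The sketch below shows the kind of context check a hand-written tokenizer might do. The token kinds and the function `slashStartsRegex` are hypothetical, and the rule is deliberately simplified: it gets cases like `this / 2` or `if (x) /re/.test(y)` wrong, which is exactly why real parsers need more context than this. It is not how Babylon decides.

```typescript
// Hypothetical, heavily simplified kinds for the previous significant token.
type PrevTokenKind =
  | "start"        // nothing before the slash
  | "identifier"
  | "number"
  | "closeParen"   // ")"
  | "closeBracket" // "]"
  | "openParen"    // "("
  | "operator"     // "=", "+", ",", ...
  | "keyword";     // "return", "typeof", ...

// Simplified rule of thumb: after something that can end an expression,
// "/" is division; otherwise it starts a regular expression literal.
function slashStartsRegex(prev: PrevTokenKind): boolean {
  switch (prev) {
    case "identifier":
    case "number":
    case "closeParen":
    case "closeBracket":
      return false; // e.g. `a / b`, `(a + b) / 2`
    default:
      return true; // e.g. `x = /b/`, `return /b/`
  }
}
```

In a hand-written parser this decision can simply consult whatever state the parser already has, whereas a parser generator with a separate, context-free lexing stage has no natural place to put it.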

I would also say that the parser generators available for JavaScript might not yet be production-ready and were even less so when the project was first created.

Top down vs. bottom up

As I said, I've never ever seen a hand-written bottom-up parser. So the decision to write a top-down parser is a no-brainer once you decide to go with a hand-written parser.

sepp2k answered Dec 24 '22 02:12
