We have a grammar written for ANTLR v2 and I would like to migrate it to ANTLR v4. Is there a migration guide? I would also like to know which modifications to make to the existing v2 grammar so that we make good use of v4 features.
As we've previously mentioned, ANTLR4 no longer builds an AST for you directly from the grammar. We'll have to write code to do that. We have two options: Produce the very same AST that the ANTLR2 parser produced: same classes, same structure.
It's better to use an off-the-shelf parser (generator) such as ANTLR when you want to develop and use a custom language. It's better to write your own parser when your objective is to write a parser. UNLESS you have a lot of experience writing parsers and can get a working parser that way more quickly than using ANTLR.
I solved this by writing a new ANTLR 4 grammar file. There is no good transform from ANTLR 2 to ANTLR 4.
nice to meet you again!
We recently migrated a set of large grammars to ANTLR 4 and wrote some lessons here: https://tomassetti.me/migrating-from-antlr2-to-antlr4/
Let me summarize the main points here.
ANTLR 4 has features that make grammars more concise and maintainable
ANTLR2 supports only a few target platforms (Java, C#, and C++), while ANTLR4 supports many more
ANTLR4 accepts left-recursive grammars: this is a big one, as it leads to far simpler and "less deep" grammars
ANTLR4 parsers employ the adaptive LL(*) algorithm: no need for you to determine "k", which was never trivial to do
ANTLR4 no longer builds an abstract syntax tree (AST). This one will impact your migration the most
In the article we go into the details of translating the individual options and the actions on tokens.
The core part is how to handle tree-rewriting rules, which are not present in ANTLR 4 anymore.
In practice you will need a library to define the AST, which you will obtain by simplifying the parse tree produced by ANTLR v4. In ANTLR v2 you used to do that in the grammar itself, while with ANTLR v4 you do it as a follow-up step. This is good, because you end up with two simpler phases instead of one single convoluted grammar (good for maintainability and testability). However, it does require you to write a little library to represent the AST.
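To make the follow-up step more concrete, here is a minimal sketch in plain Java. It does not use the ANTLR runtime: PtNode is a hypothetical stand-in for the Context classes that ANTLR v4 would generate, and the rule names ("binary", "parens", "number") are invented for the example. The point is only the shape of the second phase: walk the parse tree, drop purely syntactic nodes such as parentheses, and build small AST classes of your own.

```java
import java.util.Arrays;
import java.util.List;

public class ParseTreeToAst {

    // Hypothetical stand-in for an ANTLR 4 parse-tree node. In a real project
    // you would walk the generated *Context classes instead.
    static final class PtNode {
        final String rule;            // rule name, or "TOKEN" for a leaf
        final String text;            // token text for leaves, null for rules
        final List<PtNode> children;

        private PtNode(String rule, String text, PtNode... children) {
            this.rule = rule;
            this.text = text;
            this.children = Arrays.asList(children);
        }

        static PtNode token(String text) {
            return new PtNode("TOKEN", text);
        }

        static PtNode rule(String name, PtNode... children) {
            return new PtNode(name, null, children);
        }
    }

    // The hand-written AST: only what later phases care about.
    interface Ast {}

    static final class Num implements Ast {
        final int value;
        Num(int value) { this.value = value; }
        @Override public String toString() { return "Num(" + value + ")"; }
    }

    static final class BinOp implements Ast {
        final String op;
        final Ast left;
        final Ast right;
        BinOp(String op, Ast left, Ast right) {
            this.op = op;
            this.left = left;
            this.right = right;
        }
        @Override public String toString() {
            return "BinOp(" + op + ", " + left + ", " + right + ")";
        }
    }

    // Second phase: simplify the parse tree into the AST.
    static Ast toAst(PtNode node) {
        switch (node.rule) {
            case "parens":   // '(' expr ')': the parentheses carry no meaning in the AST
                return toAst(node.children.get(1));
            case "binary":   // expr OP expr
                return new BinOp(node.children.get(1).text,
                                 toAst(node.children.get(0)),
                                 toAst(node.children.get(2)));
            case "number":   // a single numeric token
                return new Num(Integer.parseInt(node.children.get(0).text));
            default:
                throw new IllegalArgumentException("Unexpected rule: " + node.rule);
        }
    }

    // Parse tree that a parser might produce for "1 + (2 * 3)".
    static PtNode sampleTree() {
        PtNode inner = PtNode.rule("binary",
                PtNode.rule("number", PtNode.token("2")),
                PtNode.token("*"),
                PtNode.rule("number", PtNode.token("3")));
        return PtNode.rule("binary",
                PtNode.rule("number", PtNode.token("1")),
                PtNode.token("+"),
                PtNode.rule("parens", PtNode.token("("), inner, PtNode.token(")")));
    }

    public static void main(String[] args) {
        // prints BinOp(+, Num(1), BinOp(*, Num(2), Num(3)))
        System.out.println(toAst(sampleTree()));
    }
}
```

With a real ANTLR v4 grammar the same mapping would typically live in a visitor over the generated parse tree, so the grammar stays free of tree-building logic and the mapping can be unit-tested on its own.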
In case you use the Java target you may be interested in using this open-source library to represent the AST: https://github.com/Strumenta/kolasu