Are there any existing C++ grammar files for ANTLR?
I'm looking to lex, not parse, some C++ source code files.
I've looked on the ANTLR grammar page, and there is one listed that was created by Sun Microsystems.
However, it seems to be a generated parser rather than a grammar file.
Can anyone point me to a C++ ANTLR lexer or grammar file?
A language is specified using a context-free grammar expressed using Extended Backus–Naur Form (EBNF). ANTLR can generate lexers, parsers, tree parsers, and combined lexer-parsers.
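To make the "lex, not parse" distinction concrete, here is a minimal hand-written sketch in C++ of what a lexer does: it turns characters into a flat list of tokens and builds no tree. This is not ANTLR output and not a complete C++ lexer; the names `TokenKind`, `Token`, and `tokenize` are hypothetical, and preprocessor directives, raw strings, character literals, and multi-character operators are deliberately left out.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical token categories for this sketch; a real C++ lexer needs many more.
enum class TokenKind { Identifier, Number, StringLiteral, Punctuation };

struct Token {
    TokenKind kind;
    std::string text;
};

// Splits `src` into a flat token list. Whitespace and comments are skipped.
// Assumption: preprocessor lines, raw strings, character literals and
// multi-character operators are out of scope for this illustration.
std::vector<Token> tokenize(const std::string& src) {
    std::vector<Token> out;
    std::size_t i = 0;
    const std::size_t n = src.size();
    while (i < n) {
        const unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) { ++i; continue; }
        // Line comment: skip to the end of the line.
        if (c == '/' && i + 1 < n && src[i + 1] == '/') {
            while (i < n && src[i] != '\n') ++i;
            continue;
        }
        // Block comment: skip to the closing "*/" (or end of input if unterminated).
        if (c == '/' && i + 1 < n && src[i + 1] == '*') {
            const std::size_t end = src.find("*/", i + 2);
            i = (end == std::string::npos) ? n : end + 2;
            continue;
        }
        const std::size_t start = i;
        if (std::isalpha(c) || c == '_') {
            // Identifier or keyword (this sketch does not tell them apart).
            while (i < n && (std::isalnum(static_cast<unsigned char>(src[i])) || src[i] == '_')) ++i;
            out.push_back({TokenKind::Identifier, src.substr(start, i - start)});
        } else if (std::isdigit(c)) {
            // Simplified number: a run of digits, letters and dots (42, 3.14, 0x1F, 10u).
            while (i < n && (std::isalnum(static_cast<unsigned char>(src[i])) || src[i] == '.')) ++i;
            out.push_back({TokenKind::Number, src.substr(start, i - start)});
        } else if (c == '"') {
            ++i;  // opening quote
            while (i < n && src[i] != '"') {
                if (src[i] == '\\' && i + 1 < n) ++i;  // skip the escaped character
                ++i;
            }
            if (i < n) ++i;  // closing quote
            out.push_back({TokenKind::StringLiteral, src.substr(start, i - start)});
        } else {
            // Anything else becomes a single-character punctuation token.
            ++i;
            out.push_back({TokenKind::Punctuation, std::string(1, static_cast<char>(c))});
        }
    }
    return out;
}
```

For example, `tokenize("int x = 42; // answer")` yields five tokens (`int`, `x`, `=`, `42`, `;`) and drops the comment. A lexer generated by ANTLR from a lexer grammar plays the same role, with the token rules declared in the grammar instead of coded by hand.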
ANTLR can generate parsers in many languages: Java, C#, Python (2 and 3), JavaScript, Go, Swift, Dart, PHP, and C++.
ANTLR is a good tool for quickly creating parsers, whether you are working with a known language or creating your own DSL. The tool itself is written in Java, but it can generate parsers in several other languages, such as Python, C#, or JavaScript (with more languages supported as of version 4.6).
In fact, there are context-free grammars that you can "specify" with ANTLR that it cannot process correctly, which is true of most parser generators. (For ANTLR, this includes grammars with indirect left recursion, ambiguity, arbitrary lookahead, etc.)
C++ parsers are tough to build.
I can't speak from experience about using ANTLR's C++ grammars. Here I discuss what I learned by reading the notes attached to the one I did see at the ANTLR site; in essence, the author produced an incomplete grammar, and that was for just C++98. It has been a while since I looked; there may be others.
Our DMS Software Reengineering Toolkit has a robust C++ front end.
The lexer handles all the cruft for ANSI, GCC3, MS Visual Studio 2008, including large-precision floating point numbers, etc.
[EDIT: 12/2011. Now handles C++11 and OpenMP directives]
[EDIT: 3/2015: Now handles C++14 in both GCC and MS variants. See some parse trees here on SO]
Having "just" a parser is actually not very useful. Above and beyond "just parsing", our front end will build ASTs, build accurate symbol tables (for C++, this is extremely hard to do), perform function-local flow analysis, and allow you to carry out program transformations, etc. See Life After Parsing.
[EDIT: 5/2019: Now handles C++17 in ANSI, GCC and MS variants. Does complete name and type resolution across compilation units. Used to automate large scale God-class refactoring/splitting across systems of 3000 compilation units.]