
How to read inline comments from a Java file using the javac tools parser?

I am using javac from tools.jar (i.e. JavaCompiler) for parsing Java files. I parse the sources using an implementation of TreePathScanner. So far, everything seems fine, as I can parse imports, the package name, class names, method names, instructions...

But I do have a problem with inline comments: I cannot make them appear in the created AST, nor can I visit them. I am able, however, to read Javadoc comments for classes, methods, etc., just not inline comments.

What is the best way to read inline comments? I am looking at the NetBeans source code (as it uses javac for parsing, too), but I cannot easily find anything about it.

My desperate solution would be to use the statements' positions in the source file and then manually scan everything between two statements for comments. Or something similar, but between two tree nodes.
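A minimal sketch of that fallback, assuming the gap offsets come from javac's SourcePositions (getEndPosition of one node, getStartPosition of the next). The class GapComments and its find method are hypothetical names made up for illustration, not part of any javac API. It scans the raw source text between two offsets and skips string and char literals so that a "//" inside a literal is not reported. Text blocks and unicode escapes are not handled.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: collects the comments found in source[from, to). */
final class GapComments {

    static List<String> find(String source, int from, int to) {
        List<String> comments = new ArrayList<>();
        int i = Math.max(0, from);
        int end = Math.min(to, source.length());
        while (i < end) {
            char c = source.charAt(i);
            if (c == '"' || c == '\'') {
                // Skip literals so "//" or "/*" inside them is not mistaken for a comment.
                i = skipLiteral(source, i, end);
            } else if (c == '/' && i + 1 < end && source.charAt(i + 1) == '/') {
                // Line comment: runs to the end of the line or the end of the gap.
                int eol = source.indexOf('\n', i);
                int stop = (eol < 0 || eol > end) ? end : eol;
                comments.add(source.substring(i, stop).trim());
                i = stop;
            } else if (c == '/' && i + 1 < end && source.charAt(i + 1) == '*') {
                // Block comment: runs to the closing "*/" or the end of the gap.
                int close = source.indexOf("*/", i + 2);
                int stop = (close < 0 || close + 2 > end) ? end : close + 2;
                comments.add(source.substring(i, stop));
                i = stop;
            } else {
                i++;
            }
        }
        return comments;
    }

    /** Returns the index just past the literal that starts at 'start'. */
    private static int skipLiteral(String source, int start, int end) {
        char quote = source.charAt(start);
        int i = start + 1;
        while (i < end) {
            char c = source.charAt(i);
            if (c == '\\') {
                i += 2; // skip the escaped character
                continue;
            }
            i++;
            if (c == quote) {
                break;
            }
        }
        return Math.min(i, end);
    }
}
```

Calling GapComments.find(source, endOfPreviousNode, startOfNextNode) for each pair of neighbouring nodes would then give the comments sitting in that gap, at the cost of re-reading the source text outside the parser.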

Does anyone know a better solution? Thank you!

igr asked Feb 01 '13 22:02


2 Answers

You can't. The compiler throws them away. Compilers always do that. The Java compiler doesn't throw away Javadoc comments only because Javadoc uses the compiler to find them and the Javadoc guys got together with the compiler guys.

user207421 answered Oct 12 '22 22:10


A key difference between a "compiler parser" and a "reengineering parser" has to do with what information is captured about the layout, comments, and formats of literals. As others have observed, most compilers throw all this information away, as it isn't germane to compiling down to low-level code.

Similarly, classic parser generators (such as JavaCC, ANTLR, etc.) offer very little support for capturing or regenerating this information.

Reengineering parsers, in contrast, are used to analyze code and comments, sometimes even to revise the code without losing the comments (or while appropriately revising them). For code analysis with comments, you can't throw away the comments :-} For code modification, if you regenerate changed code based on the original, it is nice if the changed code preserves code layout, comments, and literal "formats" (e.g., regenerating a hex literal as a decimal value is legal and equivalent, but makes the original authors pretty unhappy). To do this, reengineering parsers need special lexers to capture all this data, and parsing machinery that doesn't throw it away.

Our DMS Software Reengineering Toolkit includes, well, a reengineering parser as generic machinery; DMS-based parsers exist for a wide variety of languages (including OP's interest in Java). DMS captures all the comments/layout/formatting information. Analysis tools have access to it all.

TXL and Stratego provide some support for this, too.

Ira Baxter answered Oct 12 '22 23:10