I'm working on a C project that has seen many different authors and many different documentation styles.
I'm a big fan of doxygen and other documentation generation tools, and I would like to migrate this project to use one of these systems.
Is anybody aware of a tool that can scan source code comments for keywords like "Description", "Author", "File Name" and other sorts of context, and intelligently convert the comments to a standard format? If not, I suppose I could write a crazy script, or convert manually.
Thanks
The only thing I can think of comes from the O'Reilly book on Lex + Yacc: a section in chapter 2 shows how to parse code for comments, including both the // and /*..*/ forms, and there was code to output the comments on the command line.
There's a link on the page for examples. Download the file progs.zip; the file you're looking for is ch2-09.l, which needs to be built. It can easily be modified to output the comments, and that output can then be used in a script to filter out 'Name', 'Description', etc.
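To make the idea concrete, here is a minimal hand-written sketch in plain C of the same two steps: pull the comments out of the source, then recognise keyword lines. It is not the book's ch2-09.l and not an existing tool. The function names extract_comments and to_doxygen_tag, and the keyword table, are made up for illustration. The Doxygen commands @brief, @author and @file are real, but which legacy keyword should map to which command is an assumption you would adjust for your project.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

enum scan_state { CODE, STRING, CHARLIT, LINE_COMMENT, BLOCK_COMMENT };

/* Append one character if there is room, leaving space for the terminator. */
static void put(char *out, size_t cap, size_t *len, char c)
{
    if (*len + 1 < cap)
        out[(*len)++] = c;
}

/* Copy the text of every comment in src into out, one comment per line
 * (newlines inside a block comment become spaces). String and character
 * literals are skipped so that "//" inside a string is not mistaken for a
 * comment. Returns the number of comments found. */
size_t extract_comments(const char *src, char *out, size_t cap)
{
    enum scan_state st = CODE;
    size_t len = 0, count = 0;

    for (const char *p = src; *p != '\0'; p++) {
        switch (st) {
        case CODE:
            if (p[0] == '/' && p[1] == '/') { st = LINE_COMMENT; count++; p++; }
            else if (p[0] == '/' && p[1] == '*') { st = BLOCK_COMMENT; count++; p++; }
            else if (*p == '"') st = STRING;
            else if (*p == '\'') st = CHARLIT;
            break;
        case STRING:
        case CHARLIT:
            if (*p == '\\' && p[1] != '\0') p++;          /* skip escaped char */
            else if (*p == (st == STRING ? '"' : '\'')) st = CODE;
            break;
        case LINE_COMMENT:
            if (*p == '\n') { put(out, cap, &len, '\n'); st = CODE; }
            else put(out, cap, &len, *p);
            break;
        case BLOCK_COMMENT:
            if (p[0] == '*' && p[1] == '/') { put(out, cap, &len, '\n'); st = CODE; p++; }
            else put(out, cap, &len, *p == '\n' ? ' ' : *p);
            break;
        }
    }
    if (st == LINE_COMMENT)              /* file ended inside a // comment */
        put(out, cap, &len, '\n');
    if (cap > 0)
        out[len] = '\0';
    return count;
}

/* Hypothetical mapping from legacy keywords to Doxygen commands. */
static const struct { const char *keyword; const char *tag; } tag_map[] = {
    { "Description", "@brief"  },
    { "Author",      "@author" },
    { "File Name",   "@file"   },
};

/* If line looks like "Keyword: value" (case-sensitive), write "@tag value"
 * into out and return 1; otherwise return 0 and leave out untouched. */
int to_doxygen_tag(const char *line, char *out, size_t cap)
{
    while (*line == ' ' || *line == '\t' || *line == '*')
        line++;                          /* skip indentation and comment stars */
    for (size_t i = 0; i < sizeof tag_map / sizeof tag_map[0]; i++) {
        size_t klen = strlen(tag_map[i].keyword);
        if (strncmp(line, tag_map[i].keyword, klen) == 0 && line[klen] == ':') {
            const char *value = line + klen + 1;
            while (*value == ' ' || *value == '\t')
                value++;
            snprintf(out, cap, "%s %s", tag_map[i].tag, value);
            return 1;
        }
    }
    return 0;
}
```

You would read a source file into a buffer, call extract_comments on it, then run to_doxygen_tag over each extracted line. A hand-rolled scanner like this ignores corner cases such as line continuations and trigraphs, so a lex-generated scanner or a real parser is likely the safer choice for a large code base.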
I can post the instructions here on how to do this if you are interested.
Edit: I think I have found what you are looking for, a prebuilt comment documentation extractor here.
I think, as tommieb75 suggests, a proper parser is the way to handle this.
I'd suggest looking at ANTLR, since it supports rewriting the token buffer in place, which I think would minimise what you have to do to preserve whitespace and so on. See chapter 9.7 of The Definitive ANTLR Reference.