How can I easily parse a C header (.h) file for comments and entity names using Python?
We intend to write the extracted content into a Word file afterwards; that part is already developed.
Source comments are formatted using simple tag-style rules. The comment tags make it easy to tell one entity's comment from another's, and from non-documenting comments. A comment may span multiple lines. Each comment must sit directly above the entity definition it describes:
//ENUM My comment bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla
// could be multi-line. Bla bla bla bla bla bla bla bla bla.
enum my_enum
{
    //EITEM My enum item 1.
    // Just could be multi-line too.
    MY_ENUM_ITEM_1,
    //EITEM My enum item 2
    MY_ENUM_ITEM_2,
};
//STRUCT My struct
struct my_struct {
    //MEMBER struct member 1
    int m_1_;
};
//FUNC my function 1 description.
// Could be multi-line also.
//INPUT arg1 - first argument
//RETURN pointer to an allocated my_struct instance.
my_struct* func_1(int arg1);
The parsing should produce a tree of code entities and their comments.
How can this be done quickly and without third-party libraries?
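For illustration, the tree for the enum above might be modelled as follows. This is only a sketch: the `DocNode` class, its field names, and the `dump` helper are hypothetical and just one possible layout, using nothing outside the standard library.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DocNode:
    """One documented entity (hypothetical layout, not from any library)."""
    tag: str        # comment tag, e.g. "ENUM", "EITEM", "FUNC"
    name: str       # entity name taken from the definition line
    comment: str    # comment text with continuation lines joined
    children: List["DocNode"] = field(default_factory=list)


def dump(node, depth=0):
    """Return the tree as a list of indented text lines."""
    lines = ["  " * depth + f"{node.tag} {node.name}: {node.comment}"]
    for child in node.children:
        lines.extend(dump(child, depth + 1))
    return lines


# The enum from the sample header, built by hand to show the target shape.
my_enum = DocNode("ENUM", "my_enum", "My comment, could be multi-line.", [
    DocNode("EITEM", "MY_ENUM_ITEM_1",
            "My enum item 1. Just could be multi-line too."),
    DocNode("EITEM", "MY_ENUM_ITEM_2", "My enum item 2"),
])

print("\n".join(dump(my_enum)))
```

Each node keeps its tag, so the code that writes the Word file could choose a style per entity kind.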
This has already been done. Several times over.
Here is a parser for the C language written in Python. Start with this.
http://wiki.python.org/moin/SeeGramWrap
Other parsers.
http://wiki.python.org/moin/LanguageParsing
http://nedbatchelder.com/text/python-parsers.html
You could probably download any ANSI C Yacc grammar and rework it into PLY format without too much trouble and use that as a jumping-off point.
Here's a quick and dirty solution. It won't handle comments in strings, but since this is just for header files that shouldn't be an issue.
import sys

S_CODE, S_INLINE, S_MULTILINE = range(3)

state = S_CODE
comments = ''
with open(sys.argv[1]) as f:
    # Read one character at a time; iteration stops at end of file ('').
    chars = iter(lambda: f.read(1), '')
    for c in chars:
        if state == S_CODE:
            if c == '/':
                c = next(chars, '')
                if c == '*':
                    state = S_MULTILINE
                elif c == '/':
                    state = S_INLINE
        elif state == S_INLINE:
            comments += c
            if c == '\n':
                state = S_CODE
        elif state == S_MULTILINE:
            if c == '*':
                c = next(chars, '')
                while c == '*':  # runs of stars such as "**/"
                    comments += '*'
                    c = next(chars, '')
                if c == '/':
                    comments += '\n'
                    state = S_CODE
                else:
                    comments += '*%s' % c
            else:
                comments += c
print(comments)
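Since the question's comments follow a strict `//TAG text` convention, a line-based pass with the standard `re` module may be enough to build the tree. The sketch below is an assumption-heavy illustration, not a C parser: the node layout (plain dicts), the helper names `entity_name`, `new_node`, and `parse_header`, and the name-guessing heuristic are all made up here. It ignores `/* */` comments and will misname things like function-pointer members.

```python
import re

TAG_RE = re.compile(r"^//([A-Z]+)\s+(.*)$")   # "//TAG text" starts a documenting comment
CONT_RE = re.compile(r"^//\s+(.*)$")          # "// text" continues the previous one
IDENT_RE = re.compile(r"[A-Za-z_]\w*")


def entity_name(code):
    """Guess the entity name on a definition line (heuristic only)."""
    m = re.search(r"([A-Za-z_]\w*)\s*\(", code)      # function: identifier before "("
    if m:
        return m.group(1)
    head = re.split(r"[{;,=]", code, maxsplit=1)[0]  # drop "{", ";", "," and initializers
    idents = IDENT_RE.findall(head)
    return idents[-1] if idents else None


def new_node(tag, name, comments):
    return {"tag": tag, "name": name, "comments": comments, "children": []}


def parse_header(text):
    root = new_node("FILE", None, [])
    stack = [root]   # enclosing scopes, innermost last
    pending = []     # [tag, text] pairs waiting for their definition line
    last = None      # node created for the most recent definition line
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("//"):
            m = TAG_RE.match(line)
            if m:
                pending.append([m.group(1), m.group(2).strip()])
            elif pending:
                m = CONT_RE.match(line)
                if m:
                    pending[-1][1] += " " + m.group(1).strip()
            continue  # untagged comments with nothing pending are ignored
        if IDENT_RE.search(line):
            last = None
            if pending:
                last = new_node(pending[0][0], entity_name(line), pending)
                stack[-1]["children"].append(last)
                pending = []
        for ch in line:
            if ch == "{":
                # Undocumented blocks reuse the enclosing scope so "}" still balances.
                stack.append(last if last is not None else stack[-1])
            elif ch == "}":
                if len(stack) > 1:
                    stack.pop()
                last, pending = None, []
    return root


SAMPLE = """\
//ENUM My comment
// could be multi-line.
enum my_enum
{
//EITEM My enum item 1.
MY_ENUM_ITEM_1,
//EITEM My enum item 2
MY_ENUM_ITEM_2,
};
//STRUCT My struct
struct my_struct {
//MEMBER struct member 1
int m_1_;
};
//FUNC my function 1 description.
//INPUT arg1 - first argument
//RETURN pointer to an allocated my_struct instance.
my_struct* func_1(int arg1);
"""

tree = parse_header(SAMPLE)
for node in tree["children"]:
    print(node["tag"], node["name"], [c["name"] for c in node["children"]])
```

A function keeps all of its tagged comments (`FUNC`, `INPUT`, `RETURN`) in its `comments` list, so the arguments and return value stay attached to the function rather than becoming child nodes.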