How should I do program in lex (or flex) for removing nested comments from text and print just the text which is not in comments? I should probably somehow recognize states when I am in comment and number of starting "tags" of block comment.
Lets have rules:
1.block comment
/*
block comment
*/
2. line comment
// line comment
3. Comments can be nested.
Example 1
show /* comment /* comment */ comment */ show
output:
show show
Example 2
show /* // comment
comment
*/
show
output:
show
show
Example 3
show
///* comment
comment
// /*
comment
//*/ comment
//
comment */
show
output:
show
show
You got the theory right. Here's a simple implementation; could be improved.
%x COMMENT
%%
%{
int comment_nesting = 0;
%}
"/*" BEGIN(COMMENT); ++comment_nesting;
"//".* /* // comments to end of line */
<COMMENT>[^*/]* /* Eat non-comment delimiters */
<COMMENT>"/*" ++comment_nesting;
<COMMENT>"*/" if (--comment_nesting == 0) BEGIN(INITIAL);
<COMMENT>[*/] /* Eat a / or * if it doesn't match comment sequence */
/* Could have been .|\n ECHO, but this is more efficient. */
([^/]*([/][^/*])*)* ECHO;
%%
This is exactly what you need : yy_push_state(COMMENT)
Its uses a stack to store our states which comes handy in nested situations.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With