I've written a PEG.js grammar that is supposed to parse any kind of JS/C-style comment. However, it isn't quite working: it parses the comment itself, but I haven't managed to make it ignore everything else. How should I alter this grammar so that it extracts only the comments from any kind of input?
Grammar:
Start
  = Comment

Character
  = .

Comment
  = MultiLineComment
  / SingleLineComment

LineTerminator
  = [\n\r\u2028\u2029]

MultiLineComment
  = "/*" (!"*/" Character)* "*/"

MultiLineCommentNoLineTerminator
  = "/*" (!("*/" / LineTerminator) Character)* "*/"

SingleLineComment
  = "//" (!LineTerminator Character)*
Input:
/**
 * Trending Content
 * Returns visible videos that have the largest view percentage increase over
 * the time period.
 */
Other text here
Error:
Line 5, column 4: Expected end of input but "\n" found.
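Your Start rule matches exactly one Comment, so the parser reports an error as soon as any input follows it. You need to restructure the grammar so that it first captures the line content and only then considers the comment (either single-line or multi-line), as in: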
lines = result:line* {
    return result
  }

line
  = WS* line:$( !'//' CHAR )* single_comment ( EOL / EOF ) { // single-comment line
      return line.replace(/^\s+|\s+$/g,'')
    }
  / WS* line:$( !'/*' CHAR )* multi_comment ( EOL / EOF ) { // multi-comment line
      return line.replace(/^\s+|\s+$/g,'')
    }
  / WS* line:$CHAR+ ( EOL / EOF ) { // non-blank line
      return line.replace(/^\s+|\s+$/g,'')
    }
  / WS* EOL { // blank line
      return ''
    }

single_comment = WS* '//' CHAR* WS*
multi_comment = WS* '/*' ( !'*/' ( CHAR / EOL ) )* '*/' WS*

CHAR = [^\n]
WS = [ \t]
EOF = !.
EOL = '\n'
which, when run against:
no comment here

single line comment // single-comment HERE

test of multi line comment /*
multi-comment HERE
*/

last line
returns:
[
  "no comment here",
  "",
  "single line comment",
  "",
  "test of multi line comment",
  "",
  "last line"
]
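Note that the grammar above returns the text around the comments and drops the comments themselves. If the goal is the reverse, returning only the comments, one possible variation on the question's grammar is sketched below. Treat it as a minimal sketch rather than a tested drop-in: the `Other` rule and the filtering action are assumptions added for illustration, and it relies on the `$` text operator already used above.

```
// Sketch: collect every comment and skip everything else.
Start
  = items:(Comment / Other)* {
      // Other yields null, so only the comment strings are kept
      return items.filter(function (item) { return item !== null; });
    }

// $ returns the matched text instead of a nested array
Comment
  = $MultiLineComment
  / $SingleLineComment

// Hypothetical helper: any single character, tried only after
// Comment has failed to match at the current position
Other
  = . { return null; }

LineTerminator
  = [\n\r\u2028\u2029]

MultiLineComment
  = "/*" (!"*/" .)* "*/"

SingleLineComment
  = "//" (!LineTerminator .)*
```

Run against the input from the question, this should yield an array containing the `/** ... */` block as a single string, with "Other text here" discarded. It does not know about string literals, so a `//` or `/*` inside a quoted string would still be reported as a comment.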