PEG.js - how to parse c-style comments?

While implementing a PEG.js-based parser, I got stuck adding code to handle C-style comments /* like this */.

I need to find the end marker without eating it.

This is not working:

multi = '/*' .* '*/'

The message is:

line: 14
Expected "*/" or any character but end of input found.

I do understand why this is not working, but unfortunately I have no clue how to make comment parsing functional.
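For readers who want the failure spelled out: a PEG repetition such as .* is greedy and never gives characters back, so it runs to the end of the input and leaves nothing for the closing '*/'. The following is only a hand-written sketch of that behaviour in plain JavaScript. The function name matchGreedyMulti is hypothetical and is not part of PEG.js.

```javascript
// Hand-written stand-in for the PEG expression  '/*' .* '*/'
// (hypothetical helper for illustration; PEG.js generates different code).
function matchGreedyMulti(input) {
  let pos = 0;

  // '/*' : the opening marker must be present.
  if (!input.startsWith('/*', pos)) return null;
  pos += 2;

  // .* : consume every remaining character. A PEG repetition is greedy
  // and does not backtrack, so nothing is handed back afterwards.
  while (pos < input.length) pos++;

  // '*/' : we are already at end of input, so this can never match.
  if (!input.startsWith('*/', pos)) return null;

  return input.slice(2, pos);
}

// Even a well-formed comment is rejected.
console.log(matchGreedyMulti('/* how to parse this ?? */'));
```

This mirrors the reported error: the parser reaches end of input while still expecting "*/".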

Here's the code so far:

start = item*

item = comment / content_line

content_line = _ p:content _ {return ['CONTENT',p]}

content = 'some' / 'legal' / 'values'

comment = _ p:(single / multi) {return ['COMMENT',p]}

single = '//' p:([^\n]*) {return p.join('')}

multi = 'TODO'


_ = [ \t\r\n]* {return null}

and some sample input:

// line comment, no problems here

/*
  how to parse this ??
*/

values

// another comment

some legal
Gisela asked Oct 24 '14 21:10


2 Answers

Complete code:

Parser:

start = item*

item = comment / content_line

content_line = _ p:content _ {return ['CONTENT',p]}

content = 'all' / 'legal' / 'values' / 'Thanks!'

comment = _ p:(single / multi) {return ['COMMENT',p]}

single = '//' p:([^\n]*) {return p.join('')}

multi = "/*" inner:(!"*/" i:. {return i})* "*/" {return inner.join('')}

_ = [ \t\r\n]* {return null}

Sample:

all  

// a comment

values

// another comment

legal

/*12
345 /* 
*/

Thanks!

Result:

[
    ["CONTENT","all"],
    ["COMMENT"," a comment"],
    ["CONTENT","values"],
    ["COMMENT"," another comment"],
    ["CONTENT","legal"],
    ["COMMENT","12\n345 /* \n"],
    ["CONTENT","Thanks!"]
]
Gisela answered Oct 26 '22 02:10


Use a predicate that looks ahead and makes sure there is no "*/" ahead in the character stream before matching characters:

comment
 = "/*" (!"*/" .)* "*/"

The (!"*/" .) part could be read as follows: when there's no '*/' ahead, match any character.

This will therefore successfully match comments like this: /* ... **/
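To make the predicate concrete, here is a minimal sketch of the same loop written by hand in plain JavaScript. It is not the code PEG.js generates. The function name matchMulti and the shape of its return value are assumptions chosen for illustration.

```javascript
// Hand-written equivalent of:  "/*" (!"*/" .)* "*/"
// (hypothetical helper; returns the comment body and the position after it).
function matchMulti(input, start = 0) {
  let pos = start;

  // "/*" : opening marker.
  if (!input.startsWith('/*', pos)) return null;
  pos += 2;

  let inner = '';
  // (!"*/" .)* : while "*/" is NOT ahead, consume exactly one character.
  // The lookahead itself consumes nothing.
  while (pos < input.length && !input.startsWith('*/', pos)) {
    inner += input[pos];
    pos++;
  }

  // "*/" : closing marker; an unterminated comment fails here.
  if (!input.startsWith('*/', pos)) return null;

  return { text: inner, end: pos + 2 };
}

console.log(matchMulti('/* ... **/'));
console.log(matchMulti('/*12\n345 /* \n*/'));
console.log(matchMulti('/* never closed'));
```

Because the check for "*/" runs before every single character, the loop stops at the first closing marker, which is also why a nested "/*" inside the comment is treated as ordinary text.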

Bart Kiers answered Oct 26 '22 02:10