Ignore whitespace with PEG.js

I want my grammar to ignore whitespace and newlines so that they do not appear in the PEG.js output. Also, the literal inside the parentheses should be returned in a new array.

Grammar

start
  = 'a'? sep+ ('cat'/'dog') sep* '(' sep* stmt_list sep* ')'

stmt_list
  = exp: [a-zA-Z]+ { return new Array(exp.join('')) }

sep
  = [' '\t\r\n]

Test case

a dog( Harry )

Output

[
   "a",
   [
      " "
   ],
   "dog",
   [],
   "(",
   [
      " "
   ],
   [
       "Harry"
   ],
   [
      " "
   ],
   ")"
]

Output I want

[
   "a",
   "dog",
   [
      "Harry"
   ]
]
asked Nov 24 '11 12:11 by Matthias


1 Answer

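You have to break up the grammar more, using more "non-terminals" (not sure if that's what you call them in a PEG), and give each one an action that returns only the part you want. Without an action, a sequence in PEG.js returns an array of everything it matched, which is why the whitespace shows up in your output: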

start
  = article? animal stmt_list

article
  = article:'a' __ { return article; }

animal
  = animal:('cat'/'dog') _ { return animal; }

stmt_list
  = '(' _ exp:[a-zA-Z]+ _ ')' { return [ exp.join('') ]; }

// optional whitespace
_  = [ \t\r\n]*

// mandatory whitespace
__ = [ \t\r\n]+

Thanks for asking this question!

Edit: To improve readability, the grammar uses two whitespace productions: _ (optional) and __ (mandatory).
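To see why consuming whitespace inside each rule keeps it out of the result, here is a rough hand-written JavaScript sketch of the same rules. It is not what PEG.js generates; the names parse, skipWs, literal and fail are made up for illustration, and returning null when the article is missing is an assumption of this sketch.

```javascript
// Hand-written sketch of the grammar above; NOT PEG.js-generated code.
// All helper names here are hypothetical.
function parse(input) {
  let pos = 0;

  // _ : skip whitespace; only the count is returned, never the characters
  const skipWs = () => {
    const start = pos;
    while (pos < input.length && " \t\r\n".includes(input[pos])) pos++;
    return pos - start;
  };

  // Match a literal at the current position, or return null without consuming
  const literal = (text) => {
    if (!input.startsWith(text, pos)) return null;
    pos += text.length;
    return text;
  };

  const fail = (what) => {
    throw new SyntaxError(`Expected ${what} at offset ${pos}`);
  };

  // article = 'a' __   (whitespace is mandatory; backtrack if it is missing)
  const article = () => {
    const start = pos;
    if (literal("a") !== null && skipWs() > 0) return "a";
    pos = start;
    return null; // assumption: a missing article is reported as null
  };

  // animal = ('cat' / 'dog') _
  const animal = () => {
    const name = literal("cat") ?? literal("dog") ?? fail("'cat' or 'dog'");
    skipWs();
    return name;
  };

  // stmt_list = '(' _ [a-zA-Z]+ _ ')'
  const stmtList = () => {
    literal("(") ?? fail("'('");
    skipWs();
    const match = /^[a-zA-Z]+/.exec(input.slice(pos)) ?? fail("a letter");
    pos += match[0].length;
    skipWs();
    literal(")") ?? fail("')'");
    return [match[0]]; // only the name is returned, wrapped in a new array
  };

  // start = article? animal stmt_list
  const result = [article(), animal(), stmtList()];
  if (pos !== input.length) fail("end of input");
  return result;
}

console.log(JSON.stringify(parse("a dog( Harry )"))); // ["a","dog",["Harry"]]
```

Each rule function swallows its own whitespace and returns only the labelled value, so the top-level array never sees the separators.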

answered Oct 07 '22 08:10 by Pointy