(Note: I've read other questions like this, but I haven't been able to figure this out).
I wrote this grammar:
start = call
ident = [a-z]+
spaces = [ ]+
call = f:ident spaces g:(call / ident) {
return f + "(" + g + ")";
}
With this input
a b c d
it returns
"a(b(c(d)))"
And I want
"a(b)(c)(d)"
I think this left recursive rule can give me something like that, but PEG.js doesn't support left recursion.
call = f:(call / ident) spaces g:ident {
return f + "(" + g + ")";
}
How can I eliminate the left recursion in this case?
PS: You can test this on the online PEG.js demo
A Grammar G (V, T, P, S) is left recursive if it has a production in the form. A → A α |β. Left Recursion can be eliminated by introducing new non-terminal A such that.
Removing left recursion Left recursion often poses problems for parsers, either because it leads them into infinite recursion (as in the case of most top-down parsers) or because they expect rules in a normal form that forbids it (as in the case of many bottom-up parsers, including the CYK algorithm).
Which one of the following grammars is free from left recursion? Explanation: Grammar A has direct left recursion because of the production rule: A->Aa. Grammar B doesn't have any left recursion (neither direct nor indirect).
A production can be recursive without being either left- or right-recursive, and it can even be both left- and right-recursive. The above examples are both direct recursion; the non-terminal directly derives a sequence containing the non-terminal. Recursion can also be indirect; it is still recursion.
Good question. Start by separating your first ident
from everything else, since it gets special treatment (no parentheses). Next, defer to a rule to handle the spaces ident
recursion that will collect the values that go inside parentheses. The loop wraps the ident
text and appends any new text that is collected recursively.
Here is a short-hand version of the rules (note that tail
is a separate rule):
head: ident tail?; //the "head" ident is separated
tail: spaces ident tail?; //each "tail" ident is looped over
Here is the PEG script:
start = head
ident = [a-z]+
spaces = [ ]+
head = head:ident tail:tail? {
return head + tail;
}
tail = spaces next:ident tail:tail? {
return "(" + next + ")" + tail
}
Edit: Here is an alternative that does the work in one rule and is more similar to yours.
start = head
ident = [a-z]+
spaces = [ ]+
head = head:ident tail:(spaces next:ident{return "(" + next + ")" })* {
return head + tail.join("")
}
The output for a b c d
is "a(b)(c)(d)"
for both scripts.
If I understand correctly, your problem isn't left recursion, it's the structure of the parse tree.
You've eliminated the left recursion properly, but unfortunately, the only way to get rid of left recursion is by eliminating left recursion in the raw parse tree. Most of the theory for this stuff is just about matching the right set of strings. You are still matching the same set of strings, so the theory is happy, but you wanted a left recursive parse tree. More on that problem on wikipedia.
AFAIK, You can't get the raw output of the PEG parser to be left-recursive. You can, however, do whatever you want with the output. So parse it as an array and then postprocess to give it that nice left-structure.
Doing it with a simplified (no spaces, no multicharacter identifiers) grammar:
start = call
id = [a-z]
call
= arr:id+ {
var acc = arr[0]
for (i = 1; i < arr.length; i++) {
acc = [acc, arr[i]]
}
return acc;
}
That parses abcd
to [ [ [ 'a', 'b' ], 'c' ], 'd' ]
. I just used +
instead of recursion, and then ran through the resulting array building the structure we wanted. Wikipedia has some notes on doing left recursion with a PEG.
That's assuming you wanted the data structure. If you just want the parens, replace the action with this:
var acc = arr[0]
for (i = 1; i < arr.length; i++) {
acc = acc + '(' + arr[i] + ')'
}
return acc;
Which gives a(b)(c)(d)
.
To put spaces and multicharacter id's back, you can do this:
start = call
id = [a-z]+
_ = [ ]+
call
= a:id as:arg* {
arr = [a].concat(as)
var acc = arr[0]
for (i = 1; i < arr.length; i++) {
acc = acc + '(' + arr[i] + ')'
}
return acc;
}
arg = _ a:id {return a}
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With