In order to try to implement a PEG in JavaScript that doesn't make older browsers crash from stack overflows, I would like to make a parsing expression grammar that parses a string in a non-recursive way. How do you do this? It feels mind bending.
Say you have a structure like this:
grammar
has many expressionsexpression
has many matchers
matcher
has many tokens
(or whatever is a better word for it)token
can either point to another expression
, or be a primitive string or regular expression. So if it points to another expression, this is where the recursion starts.So say you define the hierarchy like this:
var grammar = new Grammar('math');
var expression = grammar.expression;
expression('math')
.match(':number', ':operator', ':number', function(left, operator, right){
switch (operator) {
case '+': return left + right;
case '-': return left - right;
case '*': return left * right;
case '/': return left / right;
}
});
expression('number')
.match(/\d+/, parseInt);
expression('operator')
.match('+')
.match('-')
.match('*')
.match('/');
var val = grammar.parse('6*8'); // 42
When you call grammar.parse
, it starts at the root expression (which has the same name as it, so "math"). Then it iterates through each matcher, then each token, and if the token is an expression, recurses. Basically this (the parser will keep track of the offset/position of the string it's matching the patterns against; this is just pseudocode):
function parse(str, expression, position) {
var result = [];
expression.matchers.forEach(function(matcher){
matcher.tokens.forEach(function(token){
var val;
if (token.expression) {
val = parse(str, token.expression, position);
} else {
val = token.parse(str, position);
}
if (val) result.push(val);
});
});
return result;
}
parse('6*8', grammar.root, 0);
So for a simple expression like 6*8
there's very little recursion, but you could quickly get to a complex expression with many levels of nesting. Plus multiply the nesting by all those nested for loops, and the stack becomes large (I don't actually use forEach
, I use for loops, but in the for loop it calls a function most of the time, so it ends up being pretty much the same).
The question is, how do you "flatten this out"? Instead of doing recursion, how do you make this so it's essentially like this:
while (token = stack.pop()) {
val = token.parse(val);
if (val) result.push(val);
}
I'm not looking for the details of how to implement a solution to this specific PEG problem, I am more just looking for the general way you keep track of the recursive state in a non-recursive way.
The stack is used to keep track of what function called which other function. In the example, without a stack, when line 2 is executed, how does it know whether to return to line 6 or line 7? The stack is used to store where to return to.
Implementation. Many programming languages implement recursion by means of stacks. Generally, whenever a function (caller) calls another function (callee) or itself as callee, the caller function transfers execution control to the callee.
Explanation: Stacks are used for the implementation of Recursion.
Recursive functions are procedures or subroutines implemented in a programming language, whose implementation references itself. Non Recursive Function are procedures or subroutines implemented in a programming language, whose implementation does not references itself.
In general, what you do is you write a stack in code and you put your “local” variables in a “stack frame” context object that you keep on that stack. Then, where you would have a “recursive call”, you store the current stack frame and create a new one for the new current context. Doing the “return” is just a matter of reversing the operation. It's not especially complicated, but it does make the code a bit of a mess. The only thing to watch out for is that you get to the bottom of the stack at the point when you've finished parsing the expression (so that trailing tokens and missing tokens don't cause problems).
It's very much like what happens with a stack maintained in machine code, except you're not restricted to primitive values and can make things quite a lot neater (at the data structure level) as a consequence.
If you've got the time, consider writing (or using someone else's) LR(1) parser. Those maintain very little system stack and handle a number of evil cases in grammars better than your home-rolled LL(k) grammar. However, they're considerably more mysterious in how they work than what you've got now.
I am more just looking for the general way you keep track of the recursive state in a non-recursive way.
Use pushing and popping in stacks (arrays).
And it would be easier if you had goto's.
A (factorial) approach in VBA (more clear because of goto's).
Option Explicit
Sub Main()
MsgBox fac(1)
MsgBox fac(5)
End Sub
Function fac(n&)
Dim answer&, level&, stackn&(99)
level = 0
zentry:
If n = 1 Then answer = 1: GoTo zreturn
level = level + 1 ' push n
stackn(level) = n
n = n - 1 ' call fac(n-1)
GoTo zentry
zreturn:
If level = 0 Then fac = answer: Exit Function
n = stackn(level) ' pop n
level = level - 1
answer = n * answer ' factorial
GoTo zreturn
End Function
The same approach in javascript.
console.log(fac(1));
console.log(fac(5));
function fac(n) { // non-recursive
var answer, level;
var stackn = [];
level = 0;
while (true) { // no goto's
if (n == 1) { answer = 1; break; }
level = level + 1; // push n
stackn[level] = n;
n = n - 1; } // call fac(n-1)
while (true) { // no goto's
if (level == 0) { return answer; }
n = stackn[level]; // pop n
level = level - 1;
answer = n * answer; } // factorial
}
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With