I want to build a Brainfuck (damn that name) interpreter in my freshly created programming language to prove its Turing-completeness.
Now, everything is clear so far (<>+-,.) except one thing: the loops ([]).
I assume that you know the (extremely hard) BF syntax from here on:
What could the pseudocode look like? What should I do when the interpreter reaches a loop beginning ([) or a loop end (])?
Checking whether the loop should continue or stop is not the problem (current cell == 0). The problem is nesting: as loops can be nested, I suppose that I can't just use a single variable containing the starting position of the current loop.
I've seen very small BF interpreters implemented in various languages. I wonder how they managed to get the loops working, but I can't figure it out.
When you reach [, you test the cell at the data pointer. If it's false (zero), you can scan forward for the matching ] character, counting how many [ you see and marking them off as you see each ].
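The forward scan described above could be sketched as follows. This is only an illustration: the function name find_matching_close and its parameters are made up for this example, and it assumes the program's brackets are balanced.

```python
def find_matching_close(code, open_pos):
    """Return the index of the ']' that matches the '[' at open_pos."""
    depth = 0
    for i in range(open_pos, len(code)):
        if code[i] == '[':
            depth += 1          # entering a (possibly nested) loop
        elif code[i] == ']':
            depth -= 1          # mark off one open bracket
            if depth == 0:      # this ']' closes the bracket we started from
                return i
    raise ValueError("unmatched '[' at position %d" % open_pos)
```

For the program +[>[-]<-] the outer [ is at index 1 and its matching ] is at index 8, while the inner pair sits at indices 3 and 5.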
If it's true (nonzero), you need to keep track of its position so you can jump back to it later. I suggest using a stack. Push the current program position onto the stack, then when you reach ], test the cell at the data pointer again. If it's true, go to the topmost program position on the stack. If it's false, pop the position off the stack and continue.
As you nest into inner loops, the stack will cleanly record the context of each loop.
See the Wikipedia article on stacks. This is analogous to how assembly programs deal with function calls.
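Put together, the stack approach might look roughly like the sketch below. Several things here are assumptions made for the example rather than part of the answer above: the name run_with_stack, taking input as a string and returning output as a string (instead of using stdin/stdout), 8-bit wrapping cells, storing 0 on end of input, and balanced brackets in the program.

```python
def run_with_stack(code, inp=""):
    data = [0] * 30000   # data memory (size is an arbitrary choice)
    dp = 0               # data pointer
    cp = 0               # code pointer
    inp_pos = 0          # next unread character of inp
    stack = []           # positions of the '[' of every loop we are inside
    out = []
    while cp < len(code):
        cmd = code[cp]
        if cmd == '>': dp += 1
        elif cmd == '<': dp -= 1
        elif cmd == '+': data[dp] = (data[dp] + 1) % 256   # assumed 8-bit cells
        elif cmd == '-': data[dp] = (data[dp] - 1) % 256
        elif cmd == '.': out.append(chr(data[dp]))
        elif cmd == ',':
            # assumed convention: end of input stores 0
            data[dp] = ord(inp[inp_pos]) if inp_pos < len(inp) else 0
            inp_pos += 1
        elif cmd == '[':
            if data[dp]:
                stack.append(cp)          # remember where this loop starts
            else:
                depth = 1                 # skip the body: scan to the matching ']'
                while depth:
                    cp += 1
                    if code[cp] == '[': depth += 1
                    elif code[cp] == ']': depth -= 1
        elif cmd == ']':
            if data[dp]:
                cp = stack[-1]            # back to the '['; cp += 1 below re-enters the body
            else:
                stack.pop()               # loop finished, forget its start
        cp += 1
    return "".join(out)
```

Note that jumping back lands on the [ itself and the shared cp += 1 then steps into the body, so the [ is not executed a second time and its position is not pushed twice.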
Here is my "optimizing" version of the interpreter, which pre-compiles the loop jumps.
from sys import stdin, stdout

def interpret2(code):
    data = [0] * 5000   # data memory
    cp = 0              # code pointer
    dp = 0              # data pointer
    # pre-compile a jump table
    stack = []
    jump = [None] * len(code)
    for i, o in enumerate(code):
        if o == '[':
            stack.append(i)
        elif o == ']':
            jump[i] = stack.pop()
            jump[jump[i]] = i
    # execute
    while cp < len(code):
        cmd = code[cp]
        if cmd == '>': dp += 1
        elif cmd == '<': dp -= 1
        elif cmd == '+': data[dp] += 1
        elif cmd == '-': data[dp] -= 1
        elif cmd == '.': stdout.write(chr(data[dp]))
        elif cmd == ',':
            ch = stdin.read(1)
            data[dp] = ord(ch) if ch else 0   # on EOF store 0 (one common convention)
        elif cmd == '[' and not data[dp]:     # skip loop if ==0
            cp = jump[cp]
        elif cmd == ']' and data[dp]:         # loop back if !=0
            cp = jump[cp]
        cp += 1
It does a dry run of the code, keeping track of the brackets (in a stack), and marks the goto addresses in a parallel jump array, which is later consulted during execution.
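To make the jump array concrete, here is the same pre-compilation pass pulled out into a small helper and applied to a short program. The helper name build_jump_table is made up for this illustration, and balanced brackets are assumed.

```python
def build_jump_table(code):
    """Map each bracket's index to the index of its partner; other slots stay None."""
    stack = []
    jump = [None] * len(code)
    for i, c in enumerate(code):
        if c == '[':
            stack.append(i)
        elif c == ']':
            start = stack.pop()   # most recently opened, still unclosed '['
            jump[i] = start
            jump[start] = i
    return jump

# Program:  + [ > [ - ] < - ]
# Index:    0 1 2 3 4 5 6 7 8
table = build_jump_table("+[>[-]<-]")
print(table)   # [None, 8, None, 5, None, 3, None, None, 1]
```

The outer pair (1 and 8) and the inner pair (3 and 5) point at each other, so either bracket can be resolved with a single list lookup at run time.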
I compared the execution speed on a long-running BF program (calculating N digits of Pi), and this doubled the speed compared to a naive implementation in which the source is scanned forward to exit [ and scanned backward to loop on ].
How do I implement the BF loops in my interpreter?
That’s the point – it entirely depends on your language. For the case of a stack-based programming language (or any language which can use a stack), @rjh has given a good solution. Other languages would use different solutions, such as recursion (i.e. implicit use of a stack).
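As one possible illustration of the recursive variant, the sketch below lets the host language's call stack play the role of the explicit stack: each [ hands its body to a nested call. The names run_recursive and run_block, the string-based input and output, the 8-bit wrapping cells, and storing 0 on end of input are all assumptions made for this example, and balanced brackets are assumed as well.

```python
def run_recursive(code, inp=""):
    data = [0] * 30000
    dp = 0
    inp_pos = 0
    out = []

    def run_block(start, execute):
        """Walk code from start up to the ']' closing this block (or the end
        of the program) and return that index. Commands only take effect
        when execute is True; otherwise the block is merely skipped over."""
        nonlocal dp, inp_pos
        cp = start
        while cp < len(code) and code[cp] != ']':
            cmd = code[cp]
            if cmd == '[':
                while execute and data[dp]:
                    run_block(cp + 1, True)        # run the loop body once
                cp = run_block(cp + 1, False)      # find the matching ']'
            elif execute:
                if cmd == '>': dp += 1
                elif cmd == '<': dp -= 1
                elif cmd == '+': data[dp] = (data[dp] + 1) % 256   # assumed 8-bit cells
                elif cmd == '-': data[dp] = (data[dp] - 1) % 256
                elif cmd == '.': out.append(chr(data[dp]))
                elif cmd == ',':
                    # assumed convention: end of input stores 0
                    data[dp] = ord(inp[inp_pos]) if inp_pos < len(inp) else 0
                    inp_pos += 1
            cp += 1
        return cp

    run_block(0, True)
    return "".join(out)
```

The recursion depth equals the loop nesting depth of the BF program, not the number of iterations, so typical programs stay far below Python's recursion limit. The extra non-executing walk after each loop keeps the sketch short at the cost of some speed.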