Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Implementing Brainfuck loops in an interpreter

I want to build a Brainfuck (Damn that name) interpreter in my freshly created programming language to prove it's turing-completeness.

Now, everything is clear so far (<>+-,.) - except one thing: The loops ([]). I assume that you know the (extremely hard) BF syntax from here on:

  • How do I implement the BF loops in my interpreter?

How could the pseudocode look like? What should I do when the interpreter reaches a loop beginning ([) or a loop end (])?

Checking if the loop should continue or stop is not the problem (current cell==0), but:

  • When and where do I have to check?
  • How to know where the loop beginning is located?
  • How to handle nested loops?

As loops can be nested I suppose that I can't just use a variable containing the starting position of the current loop.

I've seen very small BF interpreters implemented in various languages, I wonder how they managed to get the loops working but can't figure it out.

like image 837
sub Avatar asked Apr 06 '10 20:04

sub


3 Answers

When you reach [, you test the data pointer.

If it's false, you can scan for the next matched ] character, counting up how many [ you see and making sure you mark them off as you see each ].

If it's true, you need to keep track of its position so you can jump back to it later. I suggest using a stack. Push the current program position onto the stack, then when you reach ], test the data pointer. If it's true, go to the topmost program position on the stack. If it's false, pop the position off the stack and continue.

As you nest into inner loops, the stack will cleanly record the context of each loop.

See stack (wikipedia). This is analogous to how assembly programs deal with function calls.

like image 68
rjh Avatar answered Oct 22 '22 00:10

rjh


Here is my "optimizing" version of interpreter, that pre-compiles the loop jumps.

def interpret2(code):
    data = [0] * 5000   # data memory
    cp = 0              # code pointer
    dp = 0              # data pointer
    # pre-compile a jump table
    stack = []
    jump = [None] * len(code)
    for i,o in enumerate(code):
        if o=='[':
            stack.append(i)
        elif o==']':
            jump[i] = stack.pop()
            jump[jump[i]] = i
    # execute
    while cp < len(code):
        cmd = code[cp]
        if   cmd == '>': dp += 1
        elif cmd == '<': dp -= 1
        elif cmd == '+': data[dp] += 1 
        elif cmd == '-': data[dp] -= 1 
        elif cmd == '.': stdout.write(chr(data[dp]))
        elif cmd == ',': data[dp] = ord(stdin.read(1))
        elif cmd == '[' and not data[dp]: # skip loop if ==0
            cp = jump[cp]
        elif cmd == ']' and data[dp]:     # loop back if !=0
            cp = jump[cp]
        cp += 1

It does a dry run of the code, keeping track of the brackets (in a stack) and marks the goto addresses in parallel jump array which is later consulted during execution.

I compared the execution speed on long-running BF program (calculate N digits of Pi) and this increased the speed 2x compared to an innocent implementation in which source is scanned forward to exit [ and scanned backwards to loop on ].

like image 7
Nas Banov Avatar answered Oct 22 '22 00:10

Nas Banov


How do I implement the BF loops in my interpreter?

That’s the point – it entirely depends on your language. For the case of a stack-based programming language (or any language which can use a stack), @rjh has given a good solution. Other languages would use different solutions, such as recursion (i.e. implicit use of a stack).

like image 2
Konrad Rudolph Avatar answered Oct 22 '22 00:10

Konrad Rudolph