Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What's wrong with my brainfuck parser code?

I'm trying to make a program in Java that can read, compile, and run brainfuck source files (.bf). I've gotten it to work just fine with Wikipedia's Hello World example, but it breaks on the ROT13 example (claims it reached an unmatched ] when it was actually matched).

The actual parser code is all written in one .JAVA file, but the heart of it (the actual brainfuck parser and running code) is in the below method, doNow(char). Here are what the variables are: cells is the array of characters to be run (char[]); pointer is the Java workaround to point to an address in the array (short); PC is the program counter (int), and loopStack is a stack of addresses which correspond to [s (basically a short[]). These are not the problem, as they work just fine in the Hello World test. The method that takes input automatically filters out excess characters, and I confirmed it to work perfectly from debug inspection.

Why doesn't this parser run the ROT 13 code?

Code


My parser, written in Java

  /** The array of data */
  private byte[] cells = new byte[Short.MAX_VALUE];
  /** The pointer that is manipulated by the user or program */
  private short pointer = 0;
  /** The program counter, to run compiled programs */
  private int PC = 0;
  /** The compiled commands */
  private ArrayPP<Character> commandBuffer = new ArrayPP<>();
  /** The stack of locations of loop brackets ({@code [}) in the command buffer */
  private ArrayPP<Short> loopStack = new ArrayPP<>();//ArrayPP is my proprietary augmented array object, which also functions as a perfectly working stack.

  public int doNow(char command) throws IOException
  {
    PC++;
    switch (command)
    {
      case '>':
        return ++pointer;
      case '<':
        return --pointer;
      case '+':
        return ++cells[pointer];
      case '-':
        return --cells[pointer];
      case '.':
        System.out.print((char)cells[pointer]);
        return 0;
      case ',':
        return cells[pointer] = (byte)System.in.read();
      case '[':
        if (cells[pointer] == 0)//If we're ready to skip this conditional
        {
          int oldPC = PC;
          try
          {
            while (getCompiledCommand(PC) != ']')//Find the matching ]
              PC++;
            PC++;//Now that we're at the ], skip over it to the next command
          }
          catch (ArrayIndexOutOfBoundsException e)
          {
            throw new NullPointerException("Unmatched '[' at " + oldPC);//If we try to reference a command outside the buffer
          }
        }
        else//if we want to enter this conditional
          loopStack.push(PC - 1);//Add the location of this conditional to the list of conditionals which we are in
        return PC;
      case ']':
        try
        {
          return PC = loopStack.pop();//Move us to the matching [ and remove it from the list of conditionals we're in
        }
        catch (ArrayIndexOutOfBoundsException e)
        {
          throw new NullPointerException("Unmatched ] at " + PC);//If the loop stack is empty
        }
      default:
        throw new AssertionError(command + " is not a valid command.");
    }
  }
  public char getCompiledCommand(int commandIndex)
  {
    return commandBuffer.get(commandIndex);//Look into the buffer of precompiled commands and fetch the one at the given index
  }

The Hello World example (works perfectly)

+++++ +++++             initialize counter (cell #0) to 10
[                       use loop to set the next four cells to 70/100/30/10
    > +++++ ++              add  7 to cell #1
    > +++++ +++++           add 10 to cell #2 
    > +++                   add  3 to cell #3
    > +                     add  1 to cell #4
    <<<< -                  decrement counter (cell #0)
]                   
> ++ .                  print 'H'
> + .                   print 'e'
+++++ ++ .              print 'l'
.                       print 'l'
+++ .                   print 'o'
> ++ .                  print ' '
<< +++++ +++++ +++++ .  print 'W'
> .                     print 'o'
+++ .                   print 'r'
----- - .               print 'l'
----- --- .             print 'd'
> + .                   print '!'
> .                     print '\n'

ROT 13 example (My test console input is M. Breaks on command 54 after several loop iterations)

-,+[                         Read first character and start outer character reading loop
    -[                       Skip forward if character is 0
        >>++++[>++++++++<-]  Set up divisor (32) for division loop
                               (MEMORY LAYOUT: dividend copy remainder divisor quotient zero zero)
        <+<-[                Set up dividend (x minus 1) and enter division loop
            >+>+>-[>>>]      Increase copy and remainder / reduce divisor / Normal case: skip forward
            <[[>+<-]>>+>]    Special case: move remainder back to divisor and increase quotient
            <<<<<-           Decrement dividend
        ]                    End division loop
    ]>>>[-]+                 End skip loop; zero former divisor and reuse space for a flag
    >--[-[<->+++[-]]]<[         Zero that flag unless quotient was 2 or 3; zero quotient; check flag
        ++++++++++++<[       If flag then set up divisor (13) for second division loop
                               (MEMORY LAYOUT: zero copy dividend divisor remainder quotient zero zero)
            >-[>+>>]         Reduce divisor; Normal case: increase remainder
            >[+[<+>-]>+>>]   Special case: increase remainder / move it back to divisor / increase quotient
            <<<<<-           Decrease dividend
        ]                    End division loop
        >>[<+>-]             Add remainder back to divisor to get a useful 13
        >[                   Skip forward if quotient was 0
            -[               Decrement quotient and skip forward if quotient was 1
                -<<[-]>>     Zero quotient and divisor if quotient was 2
            ]<<[<<->>-]>>    Zero divisor and subtract 13 from copy if quotient was 1
        ]<<[<<+>>-]          Zero divisor and add 13 to copy if quotient was 0
    ]                        End outer skip loop (jump to here if ((character minus 1)/32) was not 2 or 3)
    <[-]                     Clear remainder from first division if second division was skipped
    <.[-]                    Output ROT13ed character from copy and clear it
    <-,+                     Read next character
]                            End character reading loop

to make it clear, here's where it breaks:

            >[+[<+>-]>+>>]   Special case: increase remainder / move it back to divisor / increase quotient
                         ^
like image 216
Ky. Avatar asked Jan 15 '23 09:01

Ky.


2 Answers

You should keep track of '[]' nestedness in the '[' case branch: now, the match for the first '[' in [+++[----]+] is the first ']', which is not good.

like image 166
Dmytro Sirenko Avatar answered Jan 21 '23 08:01

Dmytro Sirenko


Problem


It seems the problem lies in this line:

            while (getCompiledCommand(PC) != ']')//Find the matching ]

This works fine in the Hello World program because it has no nested loops. However, with nested loops, we run into the problem that it hits the first encountered ], which is not always the matching ].

Fix


One possible fix is to introduce a variable before the while loop, say loopCount, and increment it every time a [ is encountered, then decrement it when a ] is encountered and loopCount is greater than 0. For instance:

        int loopCount = 0;
        while ((command = getCompiledCommand(PC)) != ']' && loopCount == 0)//Find the matching ]. We can save the return in command because we're done using it.
        {
          if (command == '[')//If we run into a nested loop
            loopCount++;
          else if (command == ']')//If we run into the end of a nested loop
            loopCount--;

          PC++;
        }
like image 45
Ky. Avatar answered Jan 21 '23 08:01

Ky.