Automated GOTO removal algorithm

Tags:

I've heard that it's been proven theoretically possible to express any control flow in a Turing-complete language using only structured programming constructs, (conditionals, loops and loop-breaks, and subroutine calls,) without any arbitrary GOTO statements. Is there any way to use that theory to automate refactoring of code that contains GOTOs into code that does not?

Let's say I have an arbitrary single subroutine in a simple imperative language, such as C or Pascal. I also have a parser that can verify that this subroutine is valid, and produce an Abstract Syntax Tree from it. But the code contains GOTOs and Labels, which could jump forwards or backwards to any arbitrary point, including into or out of conditional or loop blocks, but not outside of the subroutine itself.

Is there an algorithm that could take this AST and rework it into new code which is semantically identical, but does not contain any Labels or GOTO statements?

832

asked Dec 27 '12 22:12

Mason Wheeler

3 Answers

In principle, it is always possible to do this, though the results might not be pretty.

One way to always eliminate gotos is to transform the program in the following way. Start off by numbering all the instructions in the original program. For example, given this program:

start:
    while (true) {
        if (x < 5) goto start;
        x++
    }

You could number the statements like this:

0 start:
1     while (x < 3) {
2         if (x < 5) goto start;
3         x++
      }

To eliminate all gotos, you can simulate the flow of the control through this function by using a while loop, an explicit variable holding the program counter, and a bunch of if statements. For example, you might translate the above code like this:

int PC = 0;
while (PC <= 3) {
    if (PC == 0) {
         PC = 1;             // Label has no effect
    } else if (PC == 1) {
         if (x < 3) PC = 4;  // Skip loop, which ends this function.
         else PC = 2;        // Enter loop.
    } else if (PC == 2) {
         if (x < 5) PC = 0;  // Simulate goto
         else PC = 3;        // Simulate if-statement fall-through
    } else if (PC == 3) {
         x++;
         PC = 1;             // Simulate jump back up to the top of the loop.
    }
}

This is a really, really bad way to do the translation, but it shows that in theory it is always possible to do this. Actually implementing this would be very messy - you'd probably number the basic blocks of the function, then generate code that puts the basic blocks into a loop, tracks which basic block is currently executing, then simulates the effect of running a basic block and the transition from that basic block to the appropriate next basic block.

Hope this helps!

answered Nov 04 '22 15:11

templatetypedef

I think you want to read Taming Control Flow by Erosa and Hendren, 1994. (Earlier link on Google scholar).

By the way, loop-breaks are also easy to eliminate. There is a simple mechanical procedure involving the creating of a boolean state variable and the restructuring of nested conditionals to create straight-line control flow. It does not produce pretty code :)

If your target language has tail-call optimization (and, ideally, inlining), you can mechanically remove both break and continue by turning the loop into a tail-recursive function. (If the index variable is modified by the loop body, you need to work harder at this. I'll just show the simplest case.) Here's the transformation of a simple loop:

for (Type Index = Start;        function loop(Index: Type):    
     Condition(Index);              if (Condition)
     Index = Advance(Index)){           return                      // break
   Body                             Body
}                                   return loop(Advance(Index))     // continue
                                loop(Start)

The return statements labeled "continue" and "break" are precisely the transformation of continue and break. Indeed, the first step in the procedure might have been to rewrite the loop into its equivalent form in the original language:

{
    Type Index = Start;
    while (true) {
        if (!Condition(Index))
            break;
        Body;
        continue;
    }
}

answered Nov 04 '22 17:11

rici

I use either/both Polyhedron's spag and vast's 77to90 to begin the process of refactoring fortran and then converting it to matlab source. However, these tools always leave 1/4 to 1/2 of the goto's in the program.

I wrote up a goto remover which accomplishes something similar to what you were describing: it takes fortran code and refactors all the remaining goto's from a program and replacing them with conditionals and do/cycle/exit's which can then be converted into other languages like matlab. You can read more about the process I use here:

http://engineering.dartmouth.edu/~d30574x/consulting/consulting_gotorefactor.html

This program could be adapted to work with other languages, but I have not gotten than far yet.

answered Nov 04 '22 17:11

Ben Barrowes

Related questions
                            
                                design a system supporting massive data storage and query
                            
                                Longest path in a DAG
                            
                                Explain this dynamic programming climbing n-stair code
                            
                                Puzzle: Find the order of n persons standing in a line (based on their heights)
                            
                                QuickSort Dijkstra 3-Way Partitioning: why the extra swapping?
                            
                                How to straighten unneeded turns in a A* graph search result?
                            
                                Time complexity of inserting in to a heap
                            
                                Generating Luhn Checksums
                            
                                Image palette reduction
                            
                                Complexity of algorithms of different programming paradigms
                            
                                Time delay estimation between two audio signals
                            
                                Algorithm for Rating Objects Based on Amount of Votes and 5 Star Rating
                            
                                I wonder how reverse image search services like tineye.com work ...?
                            
                                Efficient algorithm for counting unique states of tic tac toe
                            
                                support vector machines - a simple explanation?
                            
                                Best way to track maximal distance in a set of points?
                            
                                Is there a good way to do this type of mining?
                            
                                How to approximate the count of distinct values in an array in a single pass through it
                            
                                Java.util.ArrayList.sort() sorting algorithm
                            
                                Data structure for efficient percentile lookups?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Automated GOTO removal algorithm

Tags:

language-agnostic

algorithm

goto