Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Automated GOTO removal algorithm

I've heard that it's been proven theoretically possible to express any control flow in a Turing-complete language using only structured programming constructs, (conditionals, loops and loop-breaks, and subroutine calls,) without any arbitrary GOTO statements. Is there any way to use that theory to automate refactoring of code that contains GOTOs into code that does not?

Let's say I have an arbitrary single subroutine in a simple imperative language, such as C or Pascal. I also have a parser that can verify that this subroutine is valid, and produce an Abstract Syntax Tree from it. But the code contains GOTOs and Labels, which could jump forwards or backwards to any arbitrary point, including into or out of conditional or loop blocks, but not outside of the subroutine itself.

Is there an algorithm that could take this AST and rework it into new code which is semantically identical, but does not contain any Labels or GOTO statements?

like image 832
Mason Wheeler Avatar asked Dec 27 '12 22:12

Mason Wheeler


People also ask

Why is goto frowned upon?

The reason that many programmers frown upon the use of goto is that with the unrestrained use of goto statements, it is easy to create a program with undefined program flow, which can never be debugged.

What are structured alternatives of goto?

As an alternative, you can always use a command exit which causes an immediate termination of the running loop and continuation your program after the corresponding enddo identifier. Another useful command is cycle which stops executing the current loop and goes back to the header of the loop to initiate a new cycle.

Is goto deprecated?

Use of goto LABEL or goto EXPR to jump into a construct is deprecated and will issue a warning. Even then, it may not be used to go into any construct that requires initialization, such as a subroutine, a foreach loop, or a given block.

Why is goto not reliable?

This is because C being a highly structured language one must take care not to use too much of these unconditional goto branching statements which makes the program a poor way of coding. The goto statement is discouraged in C, because it alters the sequential flow of logic that is the characteristic of C language.


3 Answers

In principle, it is always possible to do this, though the results might not be pretty.

One way to always eliminate gotos is to transform the program in the following way. Start off by numbering all the instructions in the original program. For example, given this program:

start:
    while (true) {
        if (x < 5) goto start;
        x++
    }

You could number the statements like this:

0 start:
1     while (x < 3) {
2         if (x < 5) goto start;
3         x++
      }

To eliminate all gotos, you can simulate the flow of the control through this function by using a while loop, an explicit variable holding the program counter, and a bunch of if statements. For example, you might translate the above code like this:

int PC = 0;
while (PC <= 3) {
    if (PC == 0) {
         PC = 1;             // Label has no effect
    } else if (PC == 1) {
         if (x < 3) PC = 4;  // Skip loop, which ends this function.
         else PC = 2;        // Enter loop.
    } else if (PC == 2) {
         if (x < 5) PC = 0;  // Simulate goto
         else PC = 3;        // Simulate if-statement fall-through
    } else if (PC == 3) {
         x++;
         PC = 1;             // Simulate jump back up to the top of the loop.
    }
}

This is a really, really bad way to do the translation, but it shows that in theory it is always possible to do this. Actually implementing this would be very messy - you'd probably number the basic blocks of the function, then generate code that puts the basic blocks into a loop, tracks which basic block is currently executing, then simulates the effect of running a basic block and the transition from that basic block to the appropriate next basic block.

Hope this helps!

like image 69
templatetypedef Avatar answered Nov 04 '22 15:11

templatetypedef


I think you want to read Taming Control Flow by Erosa and Hendren, 1994. (Earlier link on Google scholar).

By the way, loop-breaks are also easy to eliminate. There is a simple mechanical procedure involving the creating of a boolean state variable and the restructuring of nested conditionals to create straight-line control flow. It does not produce pretty code :)

If your target language has tail-call optimization (and, ideally, inlining), you can mechanically remove both break and continue by turning the loop into a tail-recursive function. (If the index variable is modified by the loop body, you need to work harder at this. I'll just show the simplest case.) Here's the transformation of a simple loop:

for (Type Index = Start;        function loop(Index: Type):    
     Condition(Index);              if (Condition)
     Index = Advance(Index)){           return                      // break
   Body                             Body
}                                   return loop(Advance(Index))     // continue
                                loop(Start)

The return statements labeled "continue" and "break" are precisely the transformation of continue and break. Indeed, the first step in the procedure might have been to rewrite the loop into its equivalent form in the original language:

{
    Type Index = Start;
    while (true) {
        if (!Condition(Index))
            break;
        Body;
        continue;
    }
}
like image 32
rici Avatar answered Nov 04 '22 17:11

rici


I use either/both Polyhedron's spag and vast's 77to90 to begin the process of refactoring fortran and then converting it to matlab source. However, these tools always leave 1/4 to 1/2 of the goto's in the program.

I wrote up a goto remover which accomplishes something similar to what you were describing: it takes fortran code and refactors all the remaining goto's from a program and replacing them with conditionals and do/cycle/exit's which can then be converted into other languages like matlab. You can read more about the process I use here:

http://engineering.dartmouth.edu/~d30574x/consulting/consulting_gotorefactor.html

This program could be adapted to work with other languages, but I have not gotten than far yet.

like image 28
Ben Barrowes Avatar answered Nov 04 '22 17:11

Ben Barrowes