Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Specifically what does a compiler do to aggressively optimize generated bytecode?

I have been reading up on the functionality of various compilers and I've come across the term "aggressive optimization" that many compilers are reported to perform. LLVM, for example cites the following compile-time optimization features:

  • Memory/pointer specific
  • Loop transforms
  • Data flow
  • Arithmetic
  • Dead code elimination
  • Inlining

What does this mean specifically? Say you had the following code snippet, how could you optimize the generated byte code to run any faster than what the compiler generated? I'm specifically interested in optimizing the bytecode of JIT-powered runtimes such as C#, Java and Flash. This is tricky because the JIT only supports a subset of the opcodes that the processor usually does, which limits the amount of optimization you can do. Still, I'm interested to see whats possible and exactly what transformations could push the limits of the VM.

Fictitious block of code:

for (i = 0; i < 100; i++){
    in = dataIn[i];
    if ((in % 5) == 0){
        out = ((in / 2) >> 16) - 10;
    }else{
        out = ((in << 5) / 2) * 50 + 10;
    }
    dataOut[i] = out;
}

Approximate pseudo code generated by the compiler, for a stack-based JIT VM such as Flash Player: (forgive me for any mistakes, this is entirely handwritten!)

// i = 0
label: "forInit"
   push 0
   writeTo "i"

// while i < 100
label: "forStart"
   push "i"
   push 100
   jumpIfMoreThan "forEnd"

       // in = dataIn[i];
       push "i"
       push "dataIn"
       readProp
       saveTo "in"

       // if ((in % 5) == 0)
       push "in"
       push 5
       mod
       push 0
       jumpIfNotEquals "ifPart2"
       label: ifPart1

           // out = ((in / 2) >> 16) - 10;
           push "in"
           push 2
           divide
           push 16
           rightshift
           push 10
           minus
           writeTo "out"
           goto "ifEnd"

       // else
       label: ifPart2

           // out = ((in << 5) / 2) * 50 + 10;
           push "in"
           push 5
           leftshift
           push 2
           divide
           push 50
           multiply
           push 10
           add
           writeTo "out"

       // dataOut[i] = out;
       label: ifEnd
           push "out"
           push "i"
           push "dataOut"
           writeProp

       // i++
       push "i"
       increment
       writeTo "i"

   // while i < 100
   goto "forStart"
label: "forEnd"
like image 595
Robin Rodricks Avatar asked Jan 16 '23 09:01

Robin Rodricks


1 Answers

Although this does not answer your question, I came across the following transformations that a C++ compiler performs to optimize the generated machine code:

  • Strength Reduction --- iteration variables used as data indices are incremented at a rate matched to the size of the data unit
  • Hidden Paremeters --- a function which returns a structure actually writes it to an area pointed to by a hidden parameter
  • Integer Division --- certain fornulas can be used to evaluate integer division more efficiently in the case of a known divisor
  • Floating Conditions --- a floating point condition is turned into a complex sequence of instructions setting and querying the floating point status
  • Complex Math --- a complex multiplication or division is turned into a library call
  • Native routines --- a memcpy(), memset(), strcmp() or strlen() operation is transformed into rep mov, rep sto, rep zcmp, or rep zscas
  • Short Circuiting --- a complex condition is is evaluated in a tree of basic blocks
  • Union Ambiguation --- information is lost regarding which member of a union is intended
  • Copy Fragmentation --- large double or aggregate values are copied word by word
  • Test Fragmentation --- a condition on a long integer value is composed of separate tests on the individual words of that value
  • Switch Fragmentation --- a switch statement is replaced by nest of conditions on a value
  • Loop Header Copy --- a loop is augmented with a condition which decides whether to enter the loop
  • Loop Unrolling --- a loop is replaced by replicated copies of the loop body
  • Function Inlining --- a function call is replaced by a copy of the body of the function
like image 194
Robin Rodricks Avatar answered Jan 20 '23 17:01

Robin Rodricks