Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Would a C compiler be allowed to replace an algorithm with another?

Tags:

c

optimization

For example you have a function sort(int* numbers, size_t count) with a bubblesort implementation and a C compiler recognizes this pattern. Would the compiler be allowed to change it to another example? Like Quicksort.

Another example would be adding all numbers from 0 to n, where the compiler could replace the for loop with (n*(n-1))/2.

like image 891
JCWasmx86 Avatar asked Dec 07 '22 10:12

JCWasmx86


1 Answers

If the called function is within the optimization domain, which there are ways to do even if the function is another file then yes. In the same way that having the called function in the same file. But that doesn't mean that the compiler assumes based on the name of a function what the called code really is, you can easily make your own C library and link it in and have the named functions (qsort for example) do whatever you want. So this is undesireable, yet the tools do it:

#include <stdio.h>
void fun ( void )
{
    printf("0");
}
Disassembly of section .text:

00000000 <fun>:
   0:   e3a00030    mov r0, #48 ; 0x30
   4:   eafffffe    b   0 <putchar>

The compiler has replaced the desired function call with some other function call.

So yes, can and will do this if the optimizer is programmed to. If it has visibility it is more likely to optimize the two parts as a whole:

unsigned int more_fun ( unsigned int a, unsigned int b )
{
    return(a+b);
}
unsigned int fun ( void )
{
    return(more_fun(1,2));
}

Disassembly of section .text:

00000000 <more_fun>:
   0:   e0800001    add r0, r0, r1
   4:   e12fff1e    bx  lr

00000008 <fun>:
   8:   e3a00003    mov r0, #3
   c:   e12fff1e    bx  lr

which is not replacing one function with another like the printf example but it is replacing the optimized function with a possibly more optimized inline solution.

And yes certainly some compilers will optimize out a dead-ish loop:

unsigned int fun ( void )
{
    unsigned int ra;
    for(ra=0;ra<10;ra++)
    {
    }
    return(ra);
}

00000000 <fun>:
   0:   e3a0000a    mov r0, #10
   4:   e12fff1e    bx  lr

Granted this is a trivial one but I have generated pseudo randomizers that were dead code like this, depends on the compiler/optimizer and the code and if it has been programmed to it will reduce the loop/code into something simpler.

Or perhaps this is what you were asking:

unsigned int more_fun ( unsigned int n )
{
    return((n*(n-1))/2);
}
unsigned int fun ( void )
{
    return(more_fun(10));
}

00000000 <more_fun>:
   0:   e2403001    sub r3, r0, #1
   4:   e0020093    mul r2, r3, r0
   8:   e1a000a2    lsr r0, r2, #1
   c:   e12fff1e    bx  lr

00000010 <fun>:
  10:   e3a0002d    mov r0, #45 ; 0x2d
  14:   e12fff1e    bx  lr

The math is removed where possible.

You can't assume a particular compiler will do this, nor expect two compilers to produce the same code/optimization. But CAN and WILL a compiler do this, yes, if it is programmed to and has visibility into all the code it needs to optimize.

As far as "allowed", the compilers job is to functionally replace one language with another, which is in part why any two compilers that translate from one language to another (same input language to same output target) can and will produce different output code, its a functional replacement there isn't a single right answer for the output of a compile, it can vary. So where possible per the language, as shown above the compiler generates functionally equivalent code, that's its job so it is allowed to do that job. Although I would not consider putchar as a printf replacement as the compiler didnt look at my printf code, but at the same time I assume this compiler (gcc) has a command line option to not permit that. Likewise not uncommon for compilers to replace say a struct assignment, two structs of the same type a = b; with a memcpy, but at the same time have a command line option indicating I don't want to you to replace the code I asked for with a memcpy, I wanted it done discretely.

Between gcc and clang you will see these behaviors so as far as the word "allowed" goes, I guess it is allowed. As shown you can examine the compilers output and see for your use case, what the compiler did.

like image 65
old_timer Avatar answered Mar 11 '23 02:03

old_timer