Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why doesn't GCC optimize this call to printf?

#include <stdio.h>
int main(void) { 
    int i;
    scanf("%d", &i);
    if(i != 30) { return(0); } 
    printf("i is equal to %d\n", i);
}

It appears that the resulting string will always be "i is equal to 30", so, why doesn't GCC optimize this call to printf with a call to puts(), or write(), for example?

(Just checked the generated assembly, with gcc -O3 (version 5.3.1), or on the Godbolt Compiler Explorer)

like image 328
Mitsos101 Avatar asked May 25 '16 11:05

Mitsos101


1 Answers

First of all, the problem is not the if; as you saw, gcc sees through the if and manages to pass 30 straight to printf.

Now, gcc does have some logic to handle special cases of printf (in particular, it does optimize printf("something\n") and even printf("%s\n", "something") to puts("something")), but it is extremely specific and doesn't go much further; printf("Hello %s\n", "world"), for example, is left as-is. Even worse, any of the variants above without a trailing newline are left untouched, even if they could be transformed to fputs("something", stdout).

I imagine that this comes down to two main problems:

  • the two cases above are extremely easy patterns to implement and happen quite frequently, but for the rest probably it's rarely worth the effort; if the string is constant and the performance is important, the programmer can take care of it easily - actually, if the performance of printf is critical he shouldn't be relying on this kind of optimization, which may break at the slightest change of format string.

    If you ask me, even just the puts optimizations above are already "going for the style points": you are not really going to gain serious performance in anything but artificial test cases.

  • When you start to go outside the realm of %s\n, printf is a minefield, because it has a strong dependency on the runtime environment; in particular, many printf specifiers are (unfortunately) affected by the locale, plus there are a host of implementation-specific quirks and specifiers (and gcc can work with printf from glibc, musl, mingw/msvcrt, ... - and at compile time you cannot invoke the target C runtime - think when you are cross-compiling).

    I agree that this simple %d case is probably safe, but I can see why they probably decided to avoid being overly smart and only perform the dumbest and safest optimizations here.


For the curious reader, here is where this optimization is actually implemented; as you can see, the function matches a restricted number of very simple cases (and GIMPLE aside, hasn't changed a lot since this nice article outlining them was written). Incidentally, the source actually explains why they couldn't implement the fputs variant for the non-newline case (there's no easy way to reference the stdout global at that compilation stage).

like image 72
Matteo Italia Avatar answered Oct 02 '22 15:10

Matteo Italia