
Why does clang not optimize a global const like a #define?

I have this test program, using a #define constant:

#include <stdio.h>

#define FOO 1

int main()
{
    printf("%d\n", FOO);

    return 0;
}

When compiled with “Apple LLVM version 10.0.0 (clang-1000.11.45.5)”, I get an executable of 8432 bytes. Here is the assembly listing:

    .section    __TEXT,__text,regular,pure_instructions
    .build_version macos, 10, 14
    .globl  _main                   ## -- Begin function main
    .p2align    4, 0x90
_main:                                  ## @main
    .cfi_startproc
## %bb.0:
    pushq   %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset %rbp, -16
    movq    %rsp, %rbp
    .cfi_def_cfa_register %rbp
    subq    $16, %rsp
    leaq    L_.str(%rip), %rdi
    movl    $1, %esi
    movl    $0, -4(%rbp)
    movb    $0, %al
    callq   _printf
    xorl    %esi, %esi
    movl    %eax, -8(%rbp)          ## 4-byte Spill
    movl    %esi, %eax
    addq    $16, %rsp
    popq    %rbp
    retq
    .cfi_endproc
                                        ## -- End function
    .section    __TEXT,__cstring,cstring_literals
L_.str:                                 ## @.str
    .asciz  "%d\n"


.subsections_via_symbols

Now I replace #define FOO 1 with const int FOO = 1;. The executable is now 8464 bytes and the assembly listing looks like this:

    .section    __TEXT,__text,regular,pure_instructions
    .build_version macos, 10, 14
    .globl  _main                   ## -- Begin function main
    .p2align    4, 0x90
_main:                                  ## @main
    .cfi_startproc
## %bb.0:
    pushq   %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset %rbp, -16
    movq    %rsp, %rbp
    .cfi_def_cfa_register %rbp
    subq    $16, %rsp
    leaq    L_.str(%rip), %rdi
    movl    $1, %esi
    movl    $0, -4(%rbp)
    movb    $0, %al
    callq   _printf
    xorl    %esi, %esi
    movl    %eax, -8(%rbp)          ## 4-byte Spill
    movl    %esi, %eax
    addq    $16, %rsp
    popq    %rbp
    retq
    .cfi_endproc
                                        ## -- End function
    .section    __TEXT,__const
    .globl  _FOO                    ## @FOO
    .p2align    2
_FOO:
    .long   1                       ## 0x1

    .section    __TEXT,__cstring,cstring_literals
L_.str:                                 ## @.str
    .asciz  "%d\n"


.subsections_via_symbols

So it actually declared a FOO variable, making the executable 32 bytes bigger. I get the same result with -O3 optimization level.

Why is that? Normally, the compiler should be intelligent enough to optimize and add the constant to the symbol table instead of taking up storage for it.

GilDev asked Jan 18 '19 14:01


2 Answers

This is another case where the difference between C and C++ matters.

In C, a file-scope const int FOO has external linkage by default. Another translation unit could refer to it, so the compiler must emit it in the object file.

Compiling with g++ or clang++ instead gives you the desired optimization, because a namespace-scope const object like FOO has internal linkage in C++ unless it is declared extern.

You can achieve the optimization in C mode by explicitly requesting internal linkage for FOO via

static const int FOO = 1;

With link-time optimization enabled (-flto), both clang and gcc also manage to strip the unused symbol, even when its linkage is external.

Baum mit Augen answered Sep 22 '22 02:09


The fact that you use the variable FOO in your second program means that it has to live somewhere, so the compiler needs to allocate it somewhere.

In the #define case, there is no variable: the preprocessor replaces the text "FOO" with the text "1", so printf() is passed a constant value, not a variable.

dan.m was user2321368 answered Sep 21 '22 02:09