
How can I get GCC to optimize this bit-shifting instruction into a move?

I'm trying to use the following code to emulate a 16-bit half-float in software:

typedef struct half
{
    unsigned short mantissa:10;
    unsigned short exponent:5;
    unsigned short sign:1;
} half;

unsigned short from_half(half h)
{
    return h.mantissa | h.exponent << 10 | h.sign << 15;
}

half to_half(unsigned short s)
{
    half result = { s, s >> 10, s >> 15 };
    return result;
}

I set this up so that it could easily be optimized into a move instruction, but lo and behold, in from_half, GCC does the bit-shifting anyway (even at -O3):

from_half:
        mov     edx, edi
        mov     eax, edi
        and     di, 1023
        shr     dx, 15
        and     eax, 31744
        movzx   edx, dl
        sal     edx, 15
        or      eax, edx
        or      eax, edi
        ret

while to_half is optimized nicely:

to_half:
        mov     eax, edi
        ret

Godbolt

I've tried other optimization levels (-O1, -O2, -Os), but none of them produces the single move I was hoping for.

Clang does what I would expect, even at -O1:

from_half:                              # @from_half
        mov     eax, edi
        ret
to_half:                                # @to_half
        mov     eax, edi
        ret

Godbolt

How can I get GCC to optimize this into a move? Why isn't it optimized that way already?

asked Mar 07 '20 17:03 by S.S. Anne




1 Answer

In addition to Booboo's answer, you can try the following, which answers your first question:

How can I get GCC to optimize this into a move?

Just cast each shifted bit-field expression to unsigned short:

unsigned short from_half(half h)
{
    return (unsigned short)h.mantissa | (unsigned short)(h.exponent << 10) | (unsigned short)(h.sign << 15);
}

https://godbolt.org/z/CfZSgC

It results in:

from_half:
        mov     eax, edi
        ret
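If you want to convince yourself that the cast version computes the same value as the original, a brute-force check over all 65536 bit patterns is cheap. The following is only a sketch: the names from_half_shift, from_half_cast and check_versions_agree are made up for this example and do not appear in the question. The check goes through the bit-field members rather than the raw storage, so it should not depend on how the compiler lays out the fields.

```c
#include <assert.h>

typedef struct half
{
    unsigned short mantissa:10;
    unsigned short exponent:5;
    unsigned short sign:1;
} half;

/* Original expression from the question (hypothetical name). */
unsigned short from_half_shift(half h)
{
    return h.mantissa | h.exponent << 10 | h.sign << 15;
}

/* Cast version from this answer (hypothetical name). */
unsigned short from_half_cast(half h)
{
    return (unsigned short)h.mantissa
         | (unsigned short)(h.exponent << 10)
         | (unsigned short)(h.sign << 15);
}

/* Same as in the question: each field keeps only its low bits. */
half to_half(unsigned short s)
{
    half result = { s, s >> 10, s >> 15 };
    return result;
}

/* Hypothetical helper: asserts that both versions reproduce every
   16-bit pattern after a round trip through to_half. */
void check_versions_agree(void)
{
    for (unsigned int s = 0; s <= 0xFFFFu; ++s) {
        half h = to_half((unsigned short)s);
        assert(from_half_shift(h) == s);
        assert(from_half_cast(h) == s);
    }
}
```

Calling check_versions_agree() from main should return silently. This only confirms that the two functions are equivalent in value; whether GCC folds either of them into a single mov still has to be checked in the generated assembly.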

Why isn't it optimized that way already?

I am not sure I have a solid answer on this one. Apparently the intermediate promotion of the bit-fields to int confuses the optimizer... But this is just a guess.
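The promotion itself is easy to observe, even if its effect on the optimizer remains a guess. Here is a small sketch using C11 _Generic; the TYPE_NAME macro and the two helper functions are invented for this illustration. It shows that the shifted expression has type int, while the cast expression has type unsigned short. It does not prove that this difference is what blocks the optimization in GCC.

```c
#include <assert.h>
#include <string.h>

typedef struct half
{
    unsigned short mantissa:10;
    unsigned short exponent:5;
    unsigned short sign:1;
} half;

/* Hypothetical macro: maps the type of an expression to its name (C11). */
#define TYPE_NAME(x) _Generic((x),          \
    int:            "int",                  \
    unsigned int:   "unsigned int",         \
    unsigned short: "unsigned short",       \
    default:        "other")

/* The bit-field is promoted to int before the shift,
   so the whole shift expression has type int. */
const char *shifted_type(void)
{
    half h = { 0, 0, 0 };
    return TYPE_NAME(h.exponent << 10);
}

/* The explicit cast brings the result back to unsigned short. */
const char *cast_type(void)
{
    half h = { 0, 0, 0 };
    return TYPE_NAME((unsigned short)(h.exponent << 10));
}
```

Under these assumptions, shifted_type() yields "int" and cast_type() yields "unsigned short", which is consistent with the guess above without confirming it.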

answered Oct 31 '22 15:10 by Alex Lop.