Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is a union more efficient than a shift on modern compilers?

Consider the simple code:

UINT64 result;
UINT32 high, low;
...
result = ((UINT64)high << 32) | (UINT64)low;

Do modern compilers turn that into a real barrel shift on high, or optimize it to a simple copy to the right location?

If not, then using a union would seem to be more efficient than the shift that most people appear to use. However, having the compiler optimize this is the ideal solution.

I'm wondering how I should advise people when they do require that extra little bit of performance.

like image 792
Adam Davis Avatar asked May 25 '11 18:05

Adam Davis


2 Answers

I wrote the following (hopefully valid) test:

#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>

void func(uint64_t x);

int main(int argc, char **argv)
{
#ifdef UNION
  union {
    uint64_t full;
    struct {
      uint32_t low;
      uint32_t high;
    } p;
  } result;
  #define value result.full
#else
  uint64_t result;
  #define value result
#endif
  uint32_t high, low;

  if (argc < 3) return 0;

  high = atoi(argv[1]);
  low = atoi(argv[2]);

#ifdef UNION
  result.p.high = high;
  result.p.low = low;
#else
  result = ((uint64_t) high << 32) | low;
#endif

  // printf("%08x%08x\n", (uint32_t) (value >> 32), (uint32_t) (value & 0xffffffff));
  func(value);

  return 0;
}

Running a diff of the unoptimized output of gcc -s:

<   mov -4(%rbp), %eax
<   movq    %rax, %rdx
<   salq    $32, %rdx
<   mov -8(%rbp), %eax
<   orq %rdx, %rax
<   movq    %rax, -16(%rbp)
---
>   movl    -4(%rbp), %eax
>   movl    %eax, -12(%rbp)
>   movl    -8(%rbp), %eax
>   movl    %eax, -16(%rbp)

I don't know assembly, so it's hard for me to analyze that. However, it looks like some shifting is taking place as expected on the non-union (top) version.

But with optimizations -O2 enabled, the output was identical. So the same code was generated and both ways will have the same performance.

(gcc version 4.5.2 on Linux/AMD64)

Partial output of optimized -O2 code with or without union:

    movq    8(%rsi), %rdi
    movl    $10, %edx
    xorl    %esi, %esi
    call    strtol

    movq    16(%rbx), %rdi
    movq    %rax, %rbp
    movl    $10, %edx
    xorl    %esi, %esi
    call    strtol

    movq    %rbp, %rdi
    mov     %eax, %eax
    salq    $32, %rdi
    orq     %rax, %rdi
    call    func

The snippet begins immediately after the jump generated by the if line.

like image 158
Matthew Avatar answered Sep 30 '22 14:09

Matthew


Modern compilers are smarter than what you might think ;-) (so yes, I think you can expect a barrel shift on any decent compiler).

Anyway, I would use the option that has a semantic closer to what you are actually trying to do.

like image 30
fortran Avatar answered Sep 30 '22 13:09

fortran