Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Fastest integer type for common architectures

The stdint.h header lacks an int_fastest_t and uint_fastest_t to correspond with the {,u}int_fastX_t types. For instances where the width of the integer type does not matter, how does one pick the integer type that allows processing the greatest quantity of bits with the least penalty to performance? For example, if one was searching for the first set bit in a buffer using a naive approach, a loop such as this might be considered:

// return the bit offset of the first 1 bit
size_t find_first_bit_set(void const *const buf)
{
    uint_fastest_t const *p = buf; // use the fastest type for comparison to zero
    for (; *p == 0; ++p); // inc p while no bits are set
    // return offset of first bit set
    return (p - buf) * sizeof(*p) * CHAR_BIT + ffsX(*p) - 1;
}

Naturally, using char would result in more operations than int. But long long might result in more expensive operations than the overhead of using int on a 32 bit system and so on.

My current assumption is for the mainstream architectures, the use of long is the safest bet: It's 32 bit on 32 bit systems, and 64 bit on 64 bit systems.

like image 424
Matt Joiner Avatar asked Sep 12 '10 07:09

Matt Joiner


People also ask

Is int faster than short?

In most cases using int in a loop is more efficient than using short. My simple tests showed a performance gain of ~10% when using int.

Is an int 16 or 32 bit?

The size of a signed int or unsigned int item is the standard size of an integer on a particular machine. For example, in 16-bit operating systems, the int type is usually 16 bits, or 2 bytes. In 32-bit operating systems, the int type is usually 32 bits, or 4 bytes.

What is uint_ fast8_ t?

uint_fast8_t means it's the fastest unsigned int with at least 8 bits. uint_least8_t means it's an unsigned int with at least 8 bits.

Are ints always 32 bits?

int is always 32 bits wide. sizeof(T) represents the number of 8-bit bytes (octets) needed to store a variable of type T .


2 Answers

int_fast8_t is always the fastest integer type in a correct implementation. There can never be integer types smaller than 8 bits (because CHAR_BIT>=8 is required), and since int_fast8_t is the fastest integer type with at least 8 bits, it's thus the fastest integer type, period.

like image 93
R.. GitHub STOP HELPING ICE Avatar answered Oct 25 '22 05:10

R.. GitHub STOP HELPING ICE


Theoretically, int is the best bet. It should map to the CPU's native register size, and thus be "optimal" in the sense you're asking about.

However, you may still find that an int-64 or int-128 is faster on some CPUs than an int-32, because although these are larger than the register size, they will reduce the number of iterations of your loop, and thus may work out more efficient by minimising the loop overheads and/or taking advantage of DMA to load/store the data faster.

(For example, on ARM-2 processors it took 4 memory cycles to load one 32-bit register, but only 5 cycles to load two sequentially, and 7 cycles to load 4 sequentially. The routine you suggest above would be optimised to use as many registers as you could free up (8 to 10 usually), and could therefore run up to 3 or 4 times faster by using multiple registers per loop iteration)

The only way to be sure is to write several routines and then profile them on the specific target machine to find out which produces the best performance.

like image 21
Jason Williams Avatar answered Oct 25 '22 05:10

Jason Williams