Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Should I truncate my ints to shorts before computing bitwise ops on them?

Suppose I have a bitwise expression f: int -> int that is dependent only on the two lowest bytes of the input. In general, would it be any faster to compute (uint32)f'((uint16)x), where f': short -> short? Or is there overhead involved in the conversion? Assume f is too complicated for the compiler to figure this out on its own.

like image 762
Elliot Gorokhovsky Avatar asked Jun 09 '17 00:06

Elliot Gorokhovsky


People also ask

What is uint_fast32_t?

It's an optional typedef that the implementation must provide iff it has an unsigned integer type of exactly 32-bits. Some have a 9-bit bytes for example, so they don't have a uint32_t . uint_fast32_t states your intent clearly: it's a type of at least 32 bits which is the best from a performance point-of-view.

How many bits is a long?

The size of the long type is 8 bytes (64 bits).

What is fast integer?

The fast type (int_fast#_t) gives you an integer that's the fastest type with a width of at least # bits (where # = 8, 16, 32, or 64). For example, int_fast32_t will give you the fastest integer type that's at least 32 bits.


1 Answers

Shorter integers are not faster. Generally speaking, the fastest types have the same size as a native CPU word (so use 32-bit integers on x86, and 64-bit integers on AMD64/x64). Additionally, unsigned integers are faster than signed integers for certain operations ( performance of unsigned vs signed integers ).

Non-word-sized integers are slower than word-sized integers because the processor hardware has to provide additional padding when loading the value, and then truncate it when storing it; there can also be alignment issues (mainly when the ISA allows non-aligned value storage - though even word-sized integers can be non-aligned too).

Recent versions of C come with typdefs of known fast types, with their names indicating the maximum sized values they can accomodate, for example int_fast8_t for the fastest type to perform octet operations on (even though it might even be a 128-bit type when compiled).

In your question, you wrote you're only performing operations on 16-bit values ( "the two lowest bytes of the input"), which means you'll want to use uint_fast16_t. You will want uint (for unsigned) because the lower 16-bits of any integer (even signed integers, like int) represent an unsigned value.

These "fast integer" types are defined in the stdint.h which comes with your compiler, it can be included directly with #include <stdint.h> or indirectly via #include <inttypes.h>. stdint.h will also include limits.h.

More information can be seen in the Wikibook on C:

https://en.wikibooks.org/wiki/C_Programming/stdint.h

stdint.h is a header file in the C standard library introduced in the C99 standard library section 7.18 to allow programmers to write more portable code by providing a set of typedefs that specify exact-width integer types, together with the defined minimum and maximum allowable values for each type, using macros

https://en.wikibooks.org/wiki/C_Programming/inttypes.h

Fast & fixed integer types | signed         | unsigned
---------------------------+----------------+---------------
 8 bit                     | int_fast8_t    | uint_fast8_t
16 bit                     | int_fast16_t   | uint_fast16_t
32 bit                     | int_fast32_t   | uint_fast32_t
64 bit                     | int_fast64_t   | uint_fast64_t

C++

As you can see, these definitions are only mandated by C99 (not C90, nor C++03), however C++11 does improve C90 compatibility by incorporating stdint.h as <cstdint> (i.e. #include <cstdint>).

Microsoft Visual Studio's C++ compiler and toolchain (as of Visual Studio 2017) is not a conforming C99 compiler (this is by-design), however as Visual C++ 2012 and higher is a conforming C++11 compiler you can use <cstdint> as-intended - it is only Visual Studio 2010 and older (2008, 2005, etc) that lack the header.

I note that Microsoft endorses the use of the Clang toolchain with Visual Studio if you wish to compile C99 code: https://blogs.msdn.microsoft.com/vcblog/2015/12/04/clang-with-microsoft-codegen-in-vs-2015-update-1/

Java:

If you're using Java then the size of most primitive types is strictly defined (as opposed to C and C++ where they only need to be capable of storing values from within a certain minimum range as it's valid for a C compiler to use a 128-bit native integer for ushort): https://docs.oracle.com/javase/tutorial/java/nutsandbolts/datatypes.html - I don't believe there is any way to achieve "fast" native integer arithmetic in pure Java.

C# / .NET:

In C# / .NET the story is similar to Java (as System.Byte, Int16, etc) are strictly defined, and in C# the short, int and long keywords are always aliases for Int16, Int32 and Int64 unlike in C where their definition is vendor-defined.

.NET does have the platform-sized System.IntPtr type, but you cannot perform bitwise or arithmetic operations on it besides addition and subtraction (and the overhead of using the type in the first place would wipe-out any performance gains from using this type - though I note IntPtr is not necessarily word-sized either anyway: The sizeof(void*) does not have to equal sizeof(uint_fast32_t) (because pointers must be large enough to store every possible valid address, yet the native word-size could be smaller, such as a CPU with a 32-bit word size, but with a 64-bit address bus, in which case sizeof(void*) == 8 but sizeof(uint_fast32_t) == 4 - and the reverse is possible: a 64-bit word machine with only a 32-bit address bus (sizeof(void*) == 4, sizeof(uint_fast32_t) == 8).

Python, JavaScript, Ruby, PHP

These are the other 4 top languages on GitHub - they all share in common a weakly-typed design and so do not directly expose any "fast int" functionality or typing system (JavaScript doesn't even let you differentiate between float and int with its Number type)... but if you want performance you wouldn't be using these languages :)

Further reading:

  • What is uint_fast32_t and why should it be used instead of the regular int and uint32_t?
  • Exotic architectures the standards committees care about
  • What's the difference between "int" and "int_fast16_t"?

In practice:

In typical desktop software development targeting x86 and x64 you can take many things for granted, but in more exotic environments things tend to be different: consider the MIPS/NEC VR3400 (similar to the VR3400i used in the Nintendo 64), it's capable of native 64-bit operations (i.e. it has a native 64-bit word size), yet 32-bit operations are actually faster than 64-bit operations, and it supports both 32 and 40-bit address spaces - which means that had stdint.h existed at the time (this was 1995, predating C99) the definitions for "least", "fast", and pointer integer types would be very different to x86.

like image 147
Dai Avatar answered Oct 08 '22 20:10

Dai