 

How many bits are in a byte for the GCC compiler?

Tags: c++, gcc

As per the C++ spec:

A byte is at least large enough to contain any member of the basic execution character set (2.3) and the eight-bit code units of the Unicode UTF-8 encoding form and is composed of a contiguous sequence of bits, the number of which is implementation defined.

That means the number of bits in a byte must be at least 8.

Now, as per GCC, the number of bits is determined by the ABI.

https://gcc.gnu.org/onlinedocs/gcc-5.4.0/gcc/Characters-implementation.html#Characters-implementation

4.4 Characters

The number of bits in a byte (C90 3.4, C99 and C11 3.6).

Determined by ABI

GCC's C++ ABI is the Itanium C++ ABI: http://itanium-cxx-abi.github.io/cxx-abi/

Can anyone point me to the location where the number of bits in a byte is mentioned?

Ujjwal asked Jan 02 '23 03:01


2 Answers

The C++ standard (and therefore most compilers) effectively only guarantees that a char is at least 8 contiguous bits. For any particular compilation, the actual number of bits depends on the target CPU architecture.

However, you'd have to try quite hard to find a target CPU that doesn't have 8-bit bytes.

If you have to write code that depends on the assumption of an 8-bit byte, then you can always write static_assert(CHAR_BIT == 8), with CHAR_BIT coming from <climits>, so that any compilation that violates your assumption fails.
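As a minimal sketch of that guard: the helper pack_be32 below is a hypothetical example of code that assumes octets, not something from the question. Note that the message-less static_assert(CHAR_BIT == 8) needs C++17; the two-argument form used here also compiles as C++11.

```cpp
#include <climits>   // CHAR_BIT
#include <cstdint>   // std::uint32_t

// Compilation stops here on any target whose bytes are not 8 bits wide.
static_assert(CHAR_BIT == 8, "this code assumes 8-bit bytes");

// Hypothetical helper: packs four bytes, most significant first, into one
// 32-bit value. Treating each array element as exactly one octet is the
// assumption that the static_assert above protects.
inline std::uint32_t pack_be32(const unsigned char (&b)[4]) {
    return (static_cast<std::uint32_t>(b[0]) << 24) |
           (static_cast<std::uint32_t>(b[1]) << 16) |
           (static_cast<std::uint32_t>(b[2]) << 8)  |
            static_cast<std::uint32_t>(b[3]);
}
```

Putting the assertion next to the code that relies on it keeps the assumption visible to whoever ports the file later.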

JMAA answered Jan 10 '23 12:01


Can anyone point me to the location where the number of bits in a byte is mentioned?

Pedantically, it isn't. That particular ABI uses "byte" in place of "octet" throughout; in the modern era, "byte" is a common synonym for "octet" because the vast majority of systems in use have 8-bit bytes.

It does say this:

In general, this document is written as a generic specification, to be usable by C++ implementations on a variety of architectures. However, it does contain processor-specific material for the Itanium 64-bit ABI, identified as such. Where structured data layout is described, we generally assume Itanium psABI member sizes.

…and the Itanium chips all have 8-bit bytes.

If you're using some other chip, and it has a different number of bits per byte, and you found a compiler that targets said chip, then you have your alternative answer. (But it doesn't, and you didn't.)

There's not really any room for interpretation here, even if the relationship between bits and bytes is not outright stated.
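If you would rather ask the toolchain than the documents, a sketch like the following reports what a given compiler uses for its target. The function name bits_per_byte is hypothetical; CHAR_BIT and std::numeric_limits are standard, and __CHAR_BIT__ is a macro that GCC predefines, so it is guarded with #ifdef for other compilers.

```cpp
#include <climits>   // CHAR_BIT
#include <limits>    // std::numeric_limits

// Hypothetical helper: the byte width the compiler uses for this target.
constexpr int bits_per_byte() { return CHAR_BIT; }

// unsigned char has no padding bits, so its digit count equals CHAR_BIT.
static_assert(std::numeric_limits<unsigned char>::digits == CHAR_BIT,
              "unsigned char should use every bit of a byte");

#ifdef __CHAR_BIT__
// GCC-specific predefined macro; it should agree with <climits>.
static_assert(__CHAR_BIT__ == CHAR_BIT,
              "predefined macro should match CHAR_BIT");
#endif
```

With GCC you can also see the predefined value without writing any code, by dumping the preprocessor macros with gcc -dM -E on an empty input and looking for __CHAR_BIT__.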

I will very occasionally write a static_assert(CHAR_BIT == 8) if I'm feeling particularly paranoid. Overall you can rely on 8-bit bytes unless you're targeting something really exotic.

Lightness Races in Orbit answered Jan 10 '23 12:01