I've heard a lot of conflicting information on the subject, but in general what I've heard is that the int type was supposed to be related to the platform word size, so for example on a 32-bit machine int would be 4 bytes.
Except that when I started coding in my childhood on DOS, I think my compiler already used a 32-bit int even though the target was a 16-bit processor (like the 286), requiring constant use of short...
And today I compiled a program of mine as 64-bit just for kicks, and int still ended up being 32 bits (and short 16 bits; I didn't test long).
I know the C standard only guarantees the ordering short <= int <= long,
yet I am curious: what happened? Why did everyone settle on these seemingly arbitrary sizes for int?
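For reference, here is a minimal C sketch to print the sizes on a given toolchain (nothing here is platform-specific; the %zu format assumes a C99-conforming printf):

```c
#include <stdio.h>

int main(void)
{
    /* These sizes are chosen by the compiler's data model,
     * not by the CPU alone. */
    printf("short:     %zu bytes\n", sizeof(short));
    printf("int:       %zu bytes\n", sizeof(int));
    printf("long:      %zu bytes\n", sizeof(long));
    printf("long long: %zu bytes\n", sizeof(long long));
    printf("void *:    %zu bytes\n", sizeof(void *));
    return 0;
}
```

On a typical 64-bit Linux build this prints 2, 4, 8, 8, 8; on 64-bit Windows, 2, 4, 4, 8, 8.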
Yes, it depends on both the processor (more specifically, the ISA, instruction set architecture, e.g., x86 vs. x86-64) and the compiler, including its programming model. For example, on 16-bit machines sizeof(int) was 2 bytes; 32-bit machines have 4 bytes for int.
The compiler is ultimately responsible for this. stackoverflow.com/questions/2331751/… is probably a better dupe.
Under the 64-bit (LP64) model, int is 32 bits in size, while long, ptr, and off_t are all 64 bits (8 bytes) in size. The 32-bit data model for z/OS® XL C/C++ compilers is ILP32 plus long long.
The C99 Rationale spends a lot of words explaining why long long was introduced. I'm quoting the part about the history of integer types; I think it can answer your question, at least partly.
Rationale for International Standard — Programming Languages — C §6.2.5 Types
In the 1970s, 16-bit C (for the PDP-11) first represented file information with 16-bit integers, which were rapidly obsoleted by disk progress. People switched to a 32-bit file system, first using int[2] constructs which were not only awkward, but also not efficiently portable to 32-bit hardware.

To solve the problem, the long type was added to the language, even though this required C on the PDP-11 to generate multiple operations to simulate 32-bit arithmetic. Even as 32-bit minicomputers became available alongside 16-bit systems, people still used int for efficiency, reserving long for cases where larger integers were truly needed, since long was noticeably less efficient on 16-bit systems. Both short and long were added to C, making short available for 16 bits, long for 32 bits, and int as convenient for performance. There was no desire to lock the numbers 16 or 32 into the language, as there existed C compilers for at least 24- and 36-bit CPUs, but rather to provide names that could be used for 32 bits as needed.

PDP-11 C might have been re-implemented with int as 32-bits, thus avoiding the need for long; but that would have made people change most uses of int to short or suffer serious performance degradation on PDP-11s. In addition to the potential impact on source code, the impact on existing object code and data files would have been worse, even in 1976. By the 1990s, with an immense installed base of software, and with widespread use of dynamic linked libraries, the impact of changing the size of a common data object in an existing environment is so high that few people would tolerate it, although it might be acceptable when creating a new environment. Hence, many vendors, to avoid namespace conflicts, have added a 64-bit integer to their 32-bit C environments using a new name, of which long long has been the most widely used.
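To make the int[2] workaround in the quote concrete, here is a hypothetical sketch of the idea: a 32-bit file offset carried as two 16-bit halves. The struct and helper names are my own illustration, not the historical PDP-11 code:

```c
#include <stdio.h>

/* Hypothetical stand-in for the pre-"long" workaround: a 32-bit value
 * stored as two 16-bit halves. unsigned short is used here because it
 * is guaranteed to hold at least 16 bits. */
struct off32 {
    unsigned short hi; /* upper 16 bits */
    unsigned short lo; /* lower 16 bits */
};

/* Reassembling the value needs a type wide enough for 32 bits. */
static unsigned long off32_value(struct off32 o)
{
    return ((unsigned long)o.hi << 16) | o.lo;
}

int main(void)
{
    struct off32 o = { 0x0001, 0x86A0 }; /* 0x000186A0 == 100000 */
    printf("offset = %lu\n", off32_value(o));
    return 0;
}
```

Every add, compare, or seek on such a value needed multi-step code like this, which is exactly the awkwardness the Rationale describes.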
This was true in the olden days, back when the memory bus had the same width as the processor registers. But that stopped being true a while ago: the Pentium was the first processor you'd find in standard hardware where the memory bus got wider, 64 bits for a 32-bit processor, a simple way to improve bus throughput.
Memory is a very significant bottleneck; it is much slower than the execution core of the processor. The problem is related to distance: the further an electrical signal has to travel, the harder it gets to switch that signal at a high frequency without it getting corrupted.
Accordingly, the sizes of the processor caches, as well as how efficiently the program uses them, heavily determine execution speed. A cache miss can easily cost a good hundred CPU cycles.
Your 64-bit processor did not get double the cache size: L1 is typically still 32 KB of instructions and 32 KB of data, whether your program executes in 32-bit or 64-bit mode. The available space on the chip and, most importantly, the distance between the cache and the execution engine are physical constraints, determined by the feature size of the process technology.
So making int 64 bits, while very simple for the compiler to do, would be quite detrimental to program speed. Such a program uses the caches much less effectively and suffers many more stalls while waiting on the memory bus.
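As a rough back-of-the-envelope illustration (not a benchmark), the same array with 64-bit elements occupies twice the cache footprint; the 64-byte cache line size in the comments is an assumption typical of x86 parts:

```c
#include <stdint.h>
#include <stdio.h>

#define N 8192 /* arbitrary element count for illustration */

int main(void)
{
    static int32_t a32[N]; /* what a 32-bit int costs */
    static int64_t a64[N]; /* what a 64-bit int would cost */

    /* Assuming 64-byte cache lines, the 64-bit array needs twice as
     * many lines, so it evicts twice as much from a fixed-size cache. */
    printf("32-bit elements: %zu bytes, %zu cache lines\n",
           sizeof a32, sizeof a32 / 64);
    printf("64-bit elements: %zu bytes, %zu cache lines\n",
           sizeof a64, sizeof a64 / 64);
    return 0;
}
```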
The dominant 64-bit data models are LLP64, the choice made by Microsoft, and LP64, the choice made on *nix operating systems. Both use 32 bits for int; LLP64 keeps long at 32 bits, while LP64 makes it 64 bits. long long is 64 bits on both.
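If you want to see which model your own compiler picked, here is a small sketch (the classification is heuristic; the sizes in the comments are the conventional ones for each model):

```c
#include <stdio.h>

int main(void)
{
    /* Conventional sizes (int / long / long long / void *):
     * LP64  (64-bit *nix):    4 / 8 / 8 / 8
     * LLP64 (64-bit Windows): 4 / 4 / 8 / 8
     * ILP32 (32-bit targets): 4 / 4 / 8 / 4 */
    if (sizeof(void *) == 8 && sizeof(long) == 8)
        puts("looks like LP64");
    else if (sizeof(void *) == 8 && sizeof(long) == 4)
        puts("looks like LLP64");
    else if (sizeof(void *) == 4)
        puts("looks like a 32-bit model such as ILP32");
    else
        puts("some other data model");
    return 0;
}
```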