Wasn't the int type supposed to be the size of the platform word?

Tags:

c

I've heard a lot of conflicting information on the subject, but the general idea I picked up was that the int type was supposed to match the platform word size, so for example on a 32-bit machine int would be 4 bytes.

Except that when I started coding as a kid on DOS, I think my compiler already used a 32-bit int even though the target was a 16-bit processor (like the 286), requiring constant use of short...

And today I compiled a program of mine as 64-bit just for kicks, and int still ended up being 32-bit (and short 16-bit; I didn't test long).

I know the C standard only requires short <= int <= long, but I am curious: what happened? Why did everyone settle on these seemingly arbitrary sizes for int?
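
For reference, a quick way to see what a given compiler and data model actually picked is to just print the sizes. A minimal C99 sketch, assuming nothing beyond the standard library:

    #include <stdio.h>
    #include <limits.h>

    int main(void)
    {
        /* Report what this particular compiler/data model chose. */
        printf("short:     %zu bytes\n", sizeof(short));
        printf("int:       %zu bytes\n", sizeof(int));
        printf("long:      %zu bytes\n", sizeof(long));
        printf("long long: %zu bytes\n", sizeof(long long));
        printf("INT_MAX:   %d\n", INT_MAX);
        return 0;
    }

Building the same file as 32-bit and 64-bit shows whether the toolchain changes anything besides long and the pointer size.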

asked Feb 01 '14 by speeder


2 Answers

The C99 Rationale uses a lot of words to explain why long long was introduced. I'm quoting the part about the history of the integer types; I think it can answer your question, at least partly.

Rationale for International Standard — Programming Languages — C §6.2.5 Types

In the 1970s, 16-bit C (for the PDP-11) first represented file information with 16-bit integers, which were rapidly obsoleted by disk progress. People switched to a 32-bit file system, first using int[2] constructs which were not only awkward, but also not efficiently portable to 32-bit hardware.

To solve the problem, the long type was added to the language, even though this required C on the PDP-11 to generate multiple operations to simulate 32-bit arithmetic. Even as 32-bit minicomputers became available alongside 16-bit systems, people still used int for efficiency, reserving long for cases where larger integers were truly needed, since long was noticeably less efficient on 16-bit systems. Both short and long were added to C, making short available for 16 bits, long for 32 bits, and int as convenient for performance. There was no desire to lock the numbers 16 or 32 into the language, as there existed C compilers for at least 24- and 36-bit CPUs, but rather to provide names that could be used for 32 bits as needed.

PDP-11 C might have been re-implemented with int as 32-bits, thus avoiding the need for long; but that would have made people change most uses of int to short or suffer serious performance degradation on PDP-11s. In addition to the potential impact on source code, the impact on existing object code and data files would have been worse, even in 1976. By the 1990s, with an immense installed base of software, and with widespread use of dynamic linked libraries, the impact of changing the size of a common data object in an existing environment is so high that few people would tolerate it, although it might be acceptable when creating a new environment. Hence, many vendors, to avoid namespace conflicts, have added a 64-bit integer to their 32-bit C environments using a new name, of which long long has been the most widely used.

answered Sep 22 '22 by Yu Hao


This was true in the olden days, back when the memory bus had the same width as the processor registers. But that stopped being true a while ago: the Pentium was the first processor you'd find in standard hardware where the memory bus got wider than the registers, 64 bits for a 32-bit processor. A simple way to improve bus throughput.

Memory is a very significant bottleneck; it is much slower than the execution core of the processor. That is largely a problem of distance: the further an electrical signal has to travel, the harder it gets to switch it at a high frequency without the signal getting corrupted.

Accordingly, the sizes of the processor caches, as well as how efficiently the program uses them, heavily determine execution speed. A cache miss can easily cost a fat hundred CPU cycles.

Your 64-bit processor did not get double the cache size: L1 is still 32 KB of instructions and 32 KB of data whether your program executes in 32-bit or 64-bit mode. The available space on the chip and, most importantly, the distance between the cache and the execution engine are physical constraints, determined by the feature size of the process technology.

So making int 64 bits wide, while very simple for the compiler to do, is very detrimental to program speed. Such a program uses the caches much less effectively and suffers many more stalls while waiting for the memory bus.
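
To put a rough number on the cache-footprint argument, here is a small sketch; the 1000-element count and the 64-byte cache line are illustrative assumptions, not figures from the answer:

    #include <stdio.h>
    #include <stdint.h>

    #define COUNT 1000          /* arbitrary element count, for illustration */
    #define CACHE_LINE 64       /* assumed line size; typical for x86, may differ */

    int main(void)
    {
        int32_t narrow[COUNT];  /* what a 32-bit int costs */
        int64_t wide[COUNT];    /* the same data if int were 64 bits */

        printf("32-bit elements: %zu bytes, ~%zu cache lines\n",
               sizeof(narrow), sizeof(narrow) / CACHE_LINE);
        printf("64-bit elements: %zu bytes, ~%zu cache lines\n",
               sizeof(wide), sizeof(wide) / CACHE_LINE);
        return 0;
    }

Same element count, twice the bytes, twice the cache lines touched, roughly twice the memory traffic.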

The dominant 64-bit data models are LLP64, the choice made by Microsoft, and LP64, the choice made on *nix operating systems. Both use 32 bits for int; LLP64 keeps long at 32 bits, LP64 makes it 64 bits. long long is 64 bits on both.
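
As a side note, if code needs a fixed width regardless of whether it runs under LLP64 or LP64, the C99 <stdint.h> types sidestep the long ambiguity entirely; a minimal sketch:

    #include <stdio.h>
    #include <stdint.h>
    #include <inttypes.h>

    int main(void)
    {
        /* 64 bits on LLP64 (Windows) and LP64 (*nix) alike,
           unlike plain long, which differs between the two models. */
        int64_t offset = INT64_C(5000000000);

        printf("long:      %zu bytes\n", sizeof(long));
        printf("long long: %zu bytes\n", sizeof(long long));
        printf("int64_t:   %zu bytes, value %" PRId64 "\n", sizeof(offset), offset);
        return 0;
    }

On a Windows LLP64 build the first line typically reports 4 bytes, on a Linux LP64 build 8 bytes, while int64_t stays 8 bytes on both.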

answered Sep 25 '22 by Hans Passant