Who decides the sizeof any datatype or structure (depending on 32 bit or 64 bit)?

The compiler or the processor? For example, sizeof(int) is 4 bytes for a 32-bit system whereas it's 8 bytes for a 64-bit system.

I also read that sizeof(int) is 4 bytes when compiled with both 32-bit and 64-bit compilers.

Suppose my CPU can run both 32-bit and 64-bit applications. Which plays the main role in deciding the size of the data: the compiler or the processor?

Asked Feb 22 '18 by Vishal-Lia


4 Answers

It's ultimately the compiler. The compiler implementors can decide to emulate whatever integer size they see fit, regardless of what the CPU handles most efficiently. That said, the C (and C++) standard is written such that the compiler implementor is free to choose the fastest and most efficient way. For many compilers, the implementers chose to keep int at 32 bits, although the CPU natively handles 64-bit ints very efficiently.

I think this was done in part to preserve portability of programs written when 32-bit machines were the most common, programs that expected an int to be 32 bits and no more. (It could also be, as user3386109 points out, that 32-bit data was preferred because it takes less space and can therefore be accessed faster.)

So if you want to make sure you get 64-bit ints, use int64_t instead of int to declare your variable. If you know your value will fit inside 32 bits, or you don't care about size, use int to let the compiler pick the most efficient representation.

As for other datatypes such as structs, they are composed of the base types such as int.
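
A minimal sketch of that advice (not part of the original answer; the printed numbers assume a typical LP64 64-bit Unix-like toolchain, and plain int is whatever your compiler's ABI chose):

    #include <stdint.h>
    #include <stdio.h>

    /* Fixed-width members keep the same size on every platform;
       plain int is whatever the compiler/ABI decided. */
    struct sample {
        int32_t exact32;   /* always 4 bytes */
        int64_t exact64;   /* always 8 bytes */
    };

    int main(void)
    {
        printf("sizeof(int)           = %zu\n", sizeof(int));     /* often 4, even on 64-bit CPUs */
        printf("sizeof(int64_t)       = %zu\n", sizeof(int64_t)); /* always 8 */
        /* struct sizes follow from their members plus padding: typically 16 here */
        printf("sizeof(struct sample) = %zu\n", sizeof(struct sample));
        return 0;
    }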

Answered Oct 14 '22 by Prof. Falken


It's not the CPU, nor the compiler, nor the operating system. It's all three at the same time.

The compiler can't just make things up. It has to adhere to the right ABI[1], the one that the operating system provides. If structs and system calls provided by the operating system have types with certain sizes and alignment requirements, the compiler isn't really free to make up its own reality unless the compiler developers want to reimplement wrapper functions for everything the operating system provides. The ABI of the operating system, in turn, can't just be completely made up; it has to do what can reasonably be done on the CPU. And very often the ABI of one operating system will be very similar to the ABIs of other operating systems on the same CPU, because it's easier to reuse the work already done (on compilers, among other things).

In the case of computers that support both 32-bit and 64-bit code, the operating system still has to do extra work to support running programs in both modes (because it has to provide two different ABIs). Some operating systems don't do that, and on those you don't have a choice.

[1] ABI stands for Application Binary Interface. It's a set of rules for how a program interacts with the operating system. It defines how a program is stored on disk so it can be run by the operating system, how to make system calls, how to link with libraries, and so on. To be able to link to a library, for example, your program and the library have to agree on how to make function calls between your program and the library (and vice versa), and to make those calls both the program and the library have to have the same idea of stack layout, register usage, function call conventions, and so on. And for function calls you need to agree on what the parameters mean, and that includes the sizes, alignment and signedness of types.
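
As a concrete illustration (a hypothetical struct, not one taken from any real OS header): the same declaration has a different size and layout under different ABIs, so a program and a library built with different assumptions would disagree about where the fields live.

    #include <stdio.h>

    /* Hypothetical structure shared between a program and a library. */
    struct message {
        char  tag;      /* 1 byte                          */
        long  value;    /* 4 bytes on ILP32, 8 on LP64     */
        void *payload;  /* 4 bytes on ILP32, 8 on LP64     */
    };

    int main(void)
    {
        /* Typically 12 under a 32-bit ILP32 ABI and 24 under a 64-bit LP64 ABI,
           because of the wider long/pointer and the padding after 'tag'. */
        printf("sizeof(struct message) = %zu\n", sizeof(struct message));
        return 0;
    }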

Answered Oct 14 '22 by Art


It is strictly, 100%, entirely the compiler that decides the value of sizeof(int). It is not a combination of the system and the compiler. It is just the compiler (and the C/C++ language specifications).

If you develop iPad or iPhone apps, the compiler runs on your Mac. The Mac and the iPhone/iPad use different processors. Nothing about your Mac tells the compiler what size should be used for int on the iPad.
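
A small sketch of the same point (assuming an x86-64 Linux host with gcc and 32-bit multilib support installed, which is not part of the original answer): the very same source file reports different sizes depending only on which target the compiler is told to generate code for.

    /* sizes.c */
    #include <stdio.h>

    int main(void)
    {
        printf("int: %zu  long: %zu  void*: %zu\n",
               sizeof(int), sizeof(long), sizeof(void *));
        return 0;
    }

    /* Built and run on the same machine:
     *   gcc -m64 sizes.c && ./a.out   ->  int: 4  long: 8  void*: 8
     *   gcc -m32 sizes.c && ./a.out   ->  int: 4  long: 4  void*: 4
     */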

Answered Oct 14 '22 by user3344003


The processor designer determines what registers and instructions are available, what the alignment rules for efficient access are, how big memory addresses are, and so on.

The C standard sets minimum requirements for the built-in types. "char" must be at least 8 bits, "short" and "int" must be at least 16 bits, "long" must be at least 32 bits and "long long" must be at least 64 bits. It also says that "char" must be equivalent to the smallest unit of memory the program can address, and that the size ordering of the standard types must be maintained.

Other standards may also have an impact. For example, version 2 of the Single UNIX Specification says that int must be at least 32 bits.

Finally, existing code has an impact. Porting is hard enough already; no one wants to make it any harder than they have to.
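
Those minimums can be written down as compile-time checks. A quick sketch (using C11 _Static_assert; strictly the standard constrains value ranges rather than object sizes, but on common implementations it comes down to the sizes checked here):

    #include <limits.h>

    _Static_assert(CHAR_BIT >= 8, "char has at least 8 bits");
    _Static_assert(sizeof(short)     * CHAR_BIT >= 16, "short has at least 16 bits");
    _Static_assert(sizeof(int)       * CHAR_BIT >= 16, "int has at least 16 bits");
    _Static_assert(sizeof(long)      * CHAR_BIT >= 32, "long has at least 32 bits");
    _Static_assert(sizeof(long long) * CHAR_BIT >= 64, "long long has at least 64 bits");

    /* The size ordering of the standard types must be maintained. */
    _Static_assert(sizeof(short) <= sizeof(int) &&
                   sizeof(int)   <= sizeof(long) &&
                   sizeof(long)  <= sizeof(long long),
                   "size ordering of the standard types");

    int main(void) { return 0; }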


When porting an OS and compiler to a new CPU, someone has to define what is known as a "C ABI". This defines how binary code talks to other binary code, including:

  • The size and alignment requirements of the built-in types.
  • The packing rules for structures (and hence what their size will be); a sketch of this follows the list.
  • How parameters are passed and returned
  • How the stack is managed
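
A small sketch of that packing rule (hypothetical structs, not from the original answer; the numbers assume a typical LP64 ABI where double is 8-byte aligned): the ABI's alignment rules, not the programmer, decide how much padding a struct contains, so the same members in a different order can give a different size.

    #include <stdio.h>

    struct padded {
        char   a;   /* 1 byte, then 7 bytes of padding       */
        double b;   /* 8 bytes, must be 8-byte aligned       */
        char   c;   /* 1 byte, then 7 bytes of tail padding  */
    };

    struct reordered {
        double b;   /* 8 bytes                               */
        char   a;   /* 1 byte                                */
        char   c;   /* 1 byte, then 6 bytes of tail padding  */
    };

    int main(void)
    {
        printf("struct padded:    %zu\n", sizeof(struct padded));    /* typically 24 */
        printf("struct reordered: %zu\n", sizeof(struct reordered)); /* typically 16 */
        printf("alignof(double):  %zu\n", _Alignof(double));         /* typically 8  */
        return 0;
    }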

In general, once an ABI is defined for a combination of CPU family and OS, it doesn't change much (sometimes the size of more obscure types like "long double" changes). Changing it brings a bunch of breakage for relatively little gain.

Similarly, those porting an OS to a platform with characteristics similar to an existing one will usually choose the same sizes as on the previous platforms the OS was ported to.


In practice, OS/compiler vendors typically settle on one of a few combinations of sizes for the basic integer types (a small program to tell them apart follows the list):

  • "LP32": char is 8 bits. short and int are 16 bits, long and pointer are 32-bits. Commonly used on 8 bit and 16 bit platforms.
  • "ILP32": char is 8 bits, short is 16 bits. int, long and pointer are all 32 bits. If long long exists it is 64 bit. Commonly used on 32 bit platforms.
  • "LLP64": char is 8 bits. short is 16 bits. int and long are 32 bits. long long and pointer are 64 bits. Used on 64 bit windows.
  • "LP64": char is 8 bits. short is 16 bits. int is 32 bits. long, long long and pointer are 64 bits. Used on most 64-bit unix-like systems.
  • "ILP64": char is 8 bits, short is 16 bits, int, long and pointer and long long are all 64 bits. Apparently used on some early 64-bit operating systems but rarely seen nowadays.

64-bit processors can typically run both 32-bit and 64-bit binaries. Generally this is handled by a compatibility layer in the OS: the 32-bit binary uses the same data types it would use when running on a 32-bit system, and the compatibility layer translates its system calls so that the 64-bit OS can handle them.

Answered Oct 14 '22 by plugwash