The x86-64 instruction set adds more registers and other improvements to help streamline executable code. However, in many applications the increased pointer size is a burden: the extra, unused bytes in every pointer clog up the cache and can even push a working set out of RAM. GCC, for example, builds with the -m32 flag, and I assume this is the reason.
It's possible to load a 32-bit value and treat it as a pointer. This doesn't require extra instructions: just load or compute the 32 bits and load from the resulting address. The trick won't be portable, though, as platforms have different memory maps; on Mac OS X, for example, the entire low 4 GiB of address space is reserved. Still, for one program I wrote, hackishly adding 0x100000000L to the 32-bit "addresses" before use improved performance greatly over either true 64-bit addresses or compiling with -m32.
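For illustration, here is a minimal sketch of that hack. kBase, compact() and expand() are names invented for the example, and the scheme only works if the platform really maps all of the program's data into [0x100000000, 0x1FFFFFFFF]:

```cpp
#include <cstdint>

// Sketch of the hack: store pointers as 32-bit values and rebase them
// before dereferencing. Valid only while every allocation lands in the
// 4 GiB window starting at kBase (an assumption, not a guarantee).
constexpr std::uintptr_t kBase = 0x100000000UL;

template <typename T>
T* expand(std::uint32_t c) {
    return reinterpret_cast<T*>(kBase + c);  // one add, then an ordinary load
}

template <typename T>
std::uint32_t compact(T* p) {
    // Drop the high bits, which are known to equal kBase's.
    return static_cast<std::uint32_t>(reinterpret_cast<std::uintptr_t>(p) - kBase);
}
```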
Is there any fundamental impediment to a 32-bit x86-64 platform? I suppose supporting such a chimera would add complexity to any operating system, and anyone wanting that last 20% should just Make it Work™, but it still seems that this would be the best fit for a variety of computationally intensive programs.
The term "x86" came into being because the names of several successors to Intel's 8086 processor end in "86", including the 80186, 80286, 80386 and 80486 processors.
Today, the term x86 is used generically to refer to any 32-bit processor compatible with the x86 instruction set. x86 microprocessors power almost every kind of computer, from laptops, desktops and servers to supercomputers.
In technical terms, x86 and x64 refer to a processor family and the instruction set its members share; neither name says anything about data sizes in particular. The term x86 covers any instruction set derived from that of the Intel 8086 processor.
The x86 moniker comes from the 32-bit instruction set: all x86 processors (the leading "80" is usually dropped from their names) run the same 32-bit instruction set and are hence all compatible. x86 has therefore become a de facto name for that set (and hence for 32-bit). AMD's original 64-bit extension of the x86 set was called AMD64.
There is an ABI called "x32" for Linux in development. It's a mix between x86_64 and ia32, similar to what you describe: a 32-bit address space combined with the full 64-bit register set. It needs a custom kernel, binutils and gcc. Some SPEC runs indicate a performance improvement of about 30% in some benchmarks. See further information at https://sites.google.com/site/x32abi/
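If you have an x32-enabled toolchain (GCC exposes the ABI through its -mx32 flag), a quick sanity check is to print the type sizes; this is a minimal sketch assuming such a toolchain is installed:

```cpp
// Build with an x32-capable toolchain, e.g.: g++ -mx32 sizes.cpp
// Under x32, pointers and long are 4 bytes while the code still uses
// the 64-bit register set; under plain x86_64 both print as 8.
#include <iostream>

int main() {
    std::cout << "sizeof(void*) = " << sizeof(void*) << '\n'
              << "sizeof(long)  = " << sizeof(long)  << '\n';
}
```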
As Mysticial commented above, ICC has the -auto-ilp32 / /Qauto-ilp32 option to use 32-bit pointers in 64-bit mode:
Instructs the compiler to analyze the program to determine if there are 64-bit pointers that can be safely shrunk into 32-bit pointers and if there are 64-bit longs (on Linux* systems) that can be safely shrunk into 32-bit longs.
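A sketch of the kind of code that analysis targets; the compile lines simply mirror the two option spellings above, and whether the pointers actually get shrunk depends on ICC's own whole-program analysis:

```cpp
// Linux:   icc -auto-ilp32 -O2 sum.cpp
// Windows: icl /Qauto-ilp32 /O2 sum.cpp
// If the compiler can prove 'data' and 'p' never need more than 32
// bits of address, it may shrink them; the source itself is unchanged.
#include <cstdlib>

long sum(std::size_t n) {
    int* data = static_cast<int*>(std::malloc(n * sizeof(int)));
    if (!data) return 0;
    for (std::size_t i = 0; i < n; ++i) data[i] = static_cast<int>(i);
    long total = 0;
    for (int* p = data; p != data + n; ++p) total += *p;
    std::free(data);
    return total;
}
```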
On Windows there's no x32 ABI like on Linux, but you can still use 32-bit pointers by disabling the /LARGEADDRESSAWARE flag (linking with /LARGEADDRESSAWARE:NO), which is enabled for 64-bit binaries by default:
By default, 64-bit Microsoft Windows-based applications have a user-mode address space of several terabytes. For precise values, see Memory Limits for Windows and Windows Server Releases. However, applications can specify that the system should allocate all memory for the application below 2 gigabytes. This feature is beneficial for 64-bit applications if the following conditions are true:
- A 2 GB address space is sufficient.
- The code has many pointer truncation warnings.
- Pointers and integers are freely mixed.
- The code has polymorphism using 32-bit data types.
All pointers are still 64-bit pointers, but the system ensures that every memory allocation occurs below the 2 GB limit, so that if the application truncates a pointer, no significant data is lost. Pointers can be truncated to 32-bit values, then extended to 64-bit values by either sign extension or zero extension.
Virtual Address Space
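As a sketch of what that guarantee buys, assuming the binary was linked with /LARGEADDRESSAWARE:NO so every allocation sits below 2 GB:

```cpp
#include <cstdint>

// Below 2 GB the upper 33 bits of a pointer are zero, so truncating to
// 32 bits loses nothing, and sign extension and zero extension of the
// truncated value agree; either one restores the pointer exactly.
std::uint32_t truncate_ptr(void* p) {
    return static_cast<std::uint32_t>(reinterpret_cast<std::uintptr_t>(p));
}

void* restore_zero(std::uint32_t v) {
    return reinterpret_cast<void*>(static_cast<std::uint64_t>(v));  // zero-extend
}

void* restore_sign(std::int32_t v) {
    return reinterpret_cast<void*>(static_cast<std::int64_t>(v));   // sign-extend
}
```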
Of course there's no direct compiler support, so you'll need to deal with the pointers manually every time you store one to memory or dereference it. The simplest solution is to write a class wrapping a 32-bit pointer to handle that.
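A minimal sketch of such a wrapper (Ptr32 is an assumed name, and it relies on the below-2-GB guarantee above):

```cpp
#include <cstdint>

// Stores a pointer in 4 bytes; conversion back is a plain zero
// extension. Only safe while every pointed-to object lives below 2 GB.
template <typename T>
class Ptr32 {
public:
    Ptr32(T* p = nullptr)
        : bits_(static_cast<std::uint32_t>(reinterpret_cast<std::uintptr_t>(p))) {}

    T* get() const { return reinterpret_cast<T*>(static_cast<std::uintptr_t>(bits_)); }
    T& operator*() const { return *get(); }
    T* operator->() const { return get(); }

private:
    std::uint32_t bits_;  // the low 32 bits are the whole pointer here
};

static_assert(sizeof(Ptr32<int>) == 4, "half the size of a raw pointer");
```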
Google's V8 engine takes a different approach, compressing pointers to 32 bits to save memory as well as improve performance. See the comparison of memory usage and performance here
See also How does the compressed pointer implementation in V8 differ from JVM's compressed Oops?
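The general shape of that technique is base-relative compression: a 32-bit offset from a known heap base is stored instead of a full pointer. The sketch below shows only the idea, under the assumption that all heap objects live inside one 4 GiB region; it is not V8's actual implementation:

```cpp
#include <cassert>
#include <cstdint>

// Assumed: the heap is reserved as a single 4 GiB region ("cage")
// starting at heap_base, so any offset into it fits in 32 bits.
std::uintptr_t heap_base = 0;  // set once, when the region is reserved

std::uint32_t compress(void* p) {
    std::uintptr_t off = reinterpret_cast<std::uintptr_t>(p) - heap_base;
    assert(off < (1ULL << 32) && "object escaped the 4 GiB cage");
    return static_cast<std::uint32_t>(off);
}

void* decompress(std::uint32_t off) {
    return reinterpret_cast<void*>(heap_base + off);  // one add per access
}
```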