32-bit pointers with the x86-64 ISA: why not?

The x86-64 instruction set adds more registers and other improvements to help streamline executable code. However, in many applications the increased pointer size is a burden. The extra, unused bytes in every pointer clog up the cache and might even overflow RAM. GCC, for example, builds with the -m32 flag, and I assume this is the reason.

It's possible to load a 32-bit value and treat it as a pointer. This doesn't necessitate extra instructions: just load or compute the 32 bits and load from the resulting address. The trick won't be portable, though, as platforms have different memory maps. On Mac OS X, the entire low 4 GiB of address space is reserved. Still, for one program I wrote, hackishly adding 0x100000000L to 32-bit "addresses" before use improved performance greatly over both true 64-bit addresses and compiling with -m32.
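A minimal sketch of the idea, assuming a program that keeps all of its nodes in one pool. Instead of adding a platform-specific constant such as 0x100000000 to a 32-bit value, it stores 32-bit indices relative to the pool's base, which is a portable stand-in for the same trick rather than the exact hack described above. The names `Node64`, `Node32`, `Pool` and `kNull` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Conventional node: on an LP64 target the link alone takes 8 bytes.
struct Node64 {
    std::int32_t value;
    Node64*      next;
};

// Compact node: the link is a 32-bit index into a pool (hypothetical layout).
struct Node32 {
    std::int32_t  value;
    std::uint32_t next;   // index of the next node, or kNull
};

// Sentinel index that plays the role of a null pointer.
constexpr std::uint32_t kNull = 0xFFFFFFFFu;

// All nodes live in one contiguous pool, so base + index reaches any node.
struct Pool {
    std::vector<Node32> nodes;

    // Prepend a node to the list starting at `head`; return the new head index.
    std::uint32_t push(std::int32_t value, std::uint32_t head) {
        assert(nodes.size() < kNull);  // the new index must fit in 32 bits
        nodes.push_back(Node32{value, head});
        return static_cast<std::uint32_t>(nodes.size() - 1);
    }

    // Walk the list from `head`; indexing widens each 32-bit link to an address.
    std::int64_t sum(std::uint32_t head) const {
        std::int64_t total = 0;
        for (std::uint32_t i = head; i != kNull; i = nodes[i].next) {
            total += nodes[i].value;
        }
        return total;
    }
};
```

On a typical LP64 target `sizeof(Node64)` is 16 and `sizeof(Node32)` is 8, so roughly twice as many nodes fit in each cache line, at the cost of one base-plus-index computation per access.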

Is there any fundamental impediment to having a 32-bit, x86-64 platform? I suppose that supporting such a chimera would add complexity to any operating system, and anyone wanting that last 20% should just Make it Work™, but it still seems that this would be the best fit for a variety of computationally intensive programs.

Potatoswatter asked Feb 10 '12 19:02

2 Answers

There is an ABI called "x32" for Linux in development. It's a mix of x86_64 and ia32 similar to what you describe: a 32-bit address space with the full 64-bit register set. It needs a custom kernel, binutils and GCC.

Some SPEC runs indicate a performance improvement of about 30% in some benchmarks. For further information, see https://sites.google.com/site/x32abi/
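To check which data model a given build actually uses, something like the following sketch may help. It assumes GCC's `-mx32` option and the `__x86_64__` / `__ILP32__` macros that GCC predefines for that target; the helper names `data_model`, `this_build_model` and `is_x32_build` are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Classify a data model from the sizes of `long` and of pointers.
// The labels are the conventional names; the function itself is illustrative.
inline std::string data_model(std::size_t long_size, std::size_t pointer_size) {
    if (pointer_size == 4 && long_size == 4) return "ILP32";  // ia32 and x32
    if (pointer_size == 8 && long_size == 8) return "LP64";   // x86-64 Linux
    if (pointer_size == 8 && long_size == 4) return "LLP64";  // 64-bit Windows
    return "other";
}

// Data model of the build this code is compiled into.
inline std::string this_build_model() {
    return data_model(sizeof(long), sizeof(void*));
}

// True when compiling for x32: the 64-bit ISA combined with 32-bit pointers.
inline bool is_x32_build() {
#if defined(__x86_64__) && defined(__ILP32__)
    return true;
#else
    return false;
#endif
}
```

Built with plain `g++` on x86-64 Linux, `this_build_model()` should report LP64; built with `g++ -mx32` on a toolchain and kernel that support x32, it should report ILP32 while `is_x32_build()` returns true.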

Gunther Piez answered Oct 31 '22 23:10

As Mysticial commented above, ICC has the -auto-ilp32 / /Qauto-ilp32 option to use 32-bit pointers in 64-bit mode:

Instructs the compiler to analyze the program to determine if there are 64-bit pointers that can be safely shrunk into 32-bit pointers and if there are 64-bit longs (on Linux* systems) that can be safely shrunk into 32-bit longs.


On Windows there's no x32 ABI like on Linux, but you can still use 32-bit pointers by disabling the /LARGEADDRESSAWARE flag (linking with /LARGEADDRESSAWARE:NO), which is enabled for 64-bit binaries by default:

By default, 64-bit Microsoft Windows-based applications have a user-mode address space of several terabytes. For precise values, see Memory Limits for Windows and Windows Server Releases. However, applications can specify that the system should allocate all memory for the application below 2 gigabytes. This feature is beneficial for 64-bit applications if the following conditions are true:

  • A 2 GB address space is sufficient.
  • The code has many pointer truncation warnings.
  • Pointers and integers are freely mixed.
  • The code has polymorphism using 32-bit data types.

All pointers are still 64-bit pointers, but the system ensures that every memory allocation occurs below the 2 GB limit, so that if the application truncates a pointer, no significant data is lost. Pointers can be truncated to 32-bit values, then extended to 64-bit values by either sign extension or zero extension.

Virtual Address Space

Of course there's no direct compiler support, so you'll need to deal with pointers manually every time you store a pointer to memory or dereference it. The simplest solution is to write a class wrapping a 32-bit pointer to handle that.
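Such a wrapper might look roughly like the sketch below. It is only an illustration: the class name `Ptr32` and its interface are hypothetical, and it assumes every address handed to it really does lie in the low 4 GiB (which the linker setting above is meant to guarantee). The range check is there to fail loudly if that assumption is violated.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>

// Hypothetical wrapper: stores only the low 32 bits of a pointer and
// zero-extends them again on every access.
template <typename T>
class Ptr32 {
public:
    Ptr32() : bits_(0) {}

    explicit Ptr32(T* p) : bits_(narrow(p)) {}

    // Zero-extend the stored bits back to a native pointer.
    T* get() const {
        return reinterpret_cast<T*>(static_cast<std::uintptr_t>(bits_));
    }

    T& operator*() const { return *get(); }
    T* operator->() const { return get(); }
    explicit operator bool() const { return bits_ != 0; }

private:
    // Reject any address that would lose information when truncated.
    static std::uint32_t narrow(T* p) {
        const auto addr = reinterpret_cast<std::uintptr_t>(p);
        if (addr > 0xFFFFFFFFu) {
            throw std::range_error("pointer does not fit in 32 bits");
        }
        return static_cast<std::uint32_t>(addr);
    }

    std::uint32_t bits_;
};
```

A structure would then declare its links as `Ptr32<Node>` instead of `Node*`, halving the space each link takes, and use `*p` or `p->field` as with a raw pointer.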


Google's V8 engine takes a different approach: it compresses pointers to 32 bits to save memory as well as improve performance. See the comparison of memory and performance improvement here.

See also How does the compressed pointer implementation in V8 differ from JVM's compressed Oops?
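The general base-plus-offset idea behind pointer compression can be sketched as follows. This is a generic illustration, not V8's actual implementation: the `Cage` type is hypothetical, and it simply assumes that every object lives inside one region of at most 4 GiB starting at `base`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical compressed-heap descriptor: all objects are assumed to live
// within 4 GiB of `base`, so a 32-bit offset identifies any of them.
struct Cage {
    std::uintptr_t base;

    // Keep only the 32-bit offset from the base.
    std::uint32_t compress(std::uintptr_t address) const {
        assert(address >= base && address - base <= 0xFFFFFFFFu);
        return static_cast<std::uint32_t>(address - base);
    }

    // Rebuild the full address by adding the base back.
    std::uintptr_t decompress(std::uint32_t offset) const {
        return base + offset;
    }
};
```

Unlike truncating pointers in a process confined to low addresses, this works wherever the heap happens to be mapped, but each dereference pays for an extra addition.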


Read more

  • How to use 32-bit pointers in 64-bit application?
  • Can a C compiler generate an executable 64-bits where pointers are 32-bits?
phuclv answered Oct 31 '22 21:10
