
size of CPU register

It's typically better to use CPU registers to their full capacity. For portable code, that means using 64-bit arithmetic and storage on a 64-bit CPU, and only 32-bit arithmetic on a 32-bit CPU (otherwise, 64-bit instructions are emulated in 32-bit mode, with devastating performance).

That means it's necessary to detect the size of CPU registers, typically at compile-time (since runtime tests are expensive).

For years now, I've used the simple heuristic sizeof(nativeRegisters) == sizeof(size_t).

It has worked fine on a lot of platforms, but it turns out to be the wrong heuristic for Linux x32: in that case, size_t is only 32 bits, while the registers can still handle 64 bits. The result is a lost performance opportunity (significant for my use case).

I would like to correctly detect the usable size of CPU registers even in such a situation.

I suspect I could find some compiler-specific macro to special-case x32 mode. But I was wondering whether something more generic exists that would cover more situations. For example, another target would be 64-bit OpenVMS: there, the native register size is 64 bits, but size_t is only 32 bits.

Cyan asked Apr 30 '16 08:04



1 Answer

There is no reliable and portable way to determine register size from C. C doesn't even have a concept of "registers" (the description of the register keyword doesn't mention CPU registers).

But it does define a set of integer types that are the fastest type of at least a specified size. <stdint.h> defines uint_fastN_t, for N = 8, 16, 32, 64.

If you're assuming that registers are at least 32 bits, then uint_fast32_t is likely to be the same size as a register, either 32 or 64 bits. This isn't guaranteed. Here's what the standard says:

Each of the following types designates an integer type that is usually fastest to operate with among all integer types that have at least the specified width.

with a footnote:

The designated type is not guaranteed to be fastest for all purposes; if the implementation has no clear grounds for choosing one type over another, it will simply pick some integer type satisfying the signedness and width requirements.

In fact, I suggest that using the [u]int_fastN_t types expresses your intent more clearly than trying to match the CPU register size.

If that doesn't work for some target, you'll need to add some special-case #if or #ifdef directives to choose a suitable type. But uint_fast32_t (or uint_fast16_t if you want to support 16-bit systems) is probably a better starting point than size_t or int.
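As a rough illustration of that approach, here is a minimal sketch that starts from uint_fast32_t and adds one special case. The names reg_t and REG_BITS are hypothetical (they do not come from the question), and the only special case shown is x86-64, on the assumption that the compiler defines __x86_64__ there, as gcc does. Other targets would need their own branches.

```c
#include <assert.h>   /* static_assert (C11) */
#include <limits.h>   /* CHAR_BIT */
#include <stdint.h>   /* uint64_t, uint_fast32_t */

/* reg_t and REG_BITS are hypothetical names, not from the original post. */
#if defined(__x86_64__)
/* Special case: x86-64 has 64-bit registers, including under the x32 ABI,
   where size_t and pointers are only 32 bits wide. */
typedef uint64_t reg_t;
#else
/* Default: trust the implementation's "fastest type of at least 32 bits". */
typedef uint_fast32_t reg_t;
#endif

/* Width of the chosen type in bits, usable in ordinary expressions. */
#define REG_BITS ((unsigned)(sizeof(reg_t) * CHAR_BIT))

/* Sanity check: both branches must give at least 32 bits. */
static_assert(sizeof(reg_t) * CHAR_BIT >= 32,
              "reg_t must hold at least 32 bits");
```

Code that processes data in register-sized chunks can then be written in terms of reg_t and REG_BITS, so a mis-detected target only requires adding one more branch to the #if chain.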

A quick experiment shows that if I compile with gcc -mx32, both uint_fast16_t and uint_fast32_t are 32 bits. They're both 64 bits when compiled without -mx32 (on my x86_64 system). Which means that, at least for gcc, the uint_fastN_t types don't do what you want. You'll need special-case code for x32. (Arguably gcc should be using 64-bit types for uint_fastN_t in x32 mode. I've just posted this question asking about that.)
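An experiment along those lines could look like the sketch below. It is not the exact program used above; BITS_OF and print_type_widths are hypothetical names introduced for illustration. Compiling it once with and once without -mx32 and calling print_type_widths(stdout) from main should reproduce the comparison.

```c
#include <assert.h>   /* static_assert (C11) */
#include <limits.h>   /* CHAR_BIT */
#include <stdint.h>   /* uint_fast16_t, uint_fast32_t */
#include <stdio.h>    /* FILE, fprintf, size_t */

/* Hypothetical helper macro: width of a type in bits. */
#define BITS_OF(T) ((unsigned)(sizeof(T) * CHAR_BIT))

/* The standard only guarantees "at least N bits" for uint_fastN_t. */
static_assert(sizeof(uint_fast32_t) * CHAR_BIT >= 32,
              "uint_fast32_t is at least 32 bits wide");

/* Hypothetical helper: print the widths relevant to the question. */
static void print_type_widths(FILE *out)
{
    fprintf(out, "size_t        : %u bits\n", BITS_OF(size_t));
    fprintf(out, "void *        : %u bits\n", BITS_OF(void *));
    fprintf(out, "uint_fast16_t : %u bits\n", BITS_OF(uint_fast16_t));
    fprintf(out, "uint_fast32_t : %u bits\n", BITS_OF(uint_fast32_t));
}
```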

A separate question asks how to detect an x32 environment in the preprocessor. gcc provides no direct way to determine this, but I've just posted an answer there suggesting the use of the __x86_64__ and SIZE_MAX macros.
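A minimal sketch of that suggestion follows, assuming gcc-style predefined macros. TARGET_IS_X32 is a hypothetical name. The idea is that an x86-64 target whose size_t is only 32 bits wide is presumably x32.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>   /* SIZE_MAX, usable in #if */

/* TARGET_IS_X32 is a hypothetical macro name, not a compiler built-in.
   Assumption: __x86_64__ is defined for both the regular x86-64 ABI and
   x32, so a 32-bit SIZE_MAX is what distinguishes the two. */
#if defined(__x86_64__) && SIZE_MAX == 0xFFFFFFFF
#  define TARGET_IS_X32 1
#else
#  define TARGET_IS_X32 0
#endif

/* Consistency check: if x32 was detected, pointers must be 4 bytes. */
static_assert(!TARGET_IS_X32 || sizeof(void *) == 4,
              "x32 detection disagrees with pointer size");
```

The result can feed a type selection such as `#if TARGET_IS_X32` followed by a 64-bit typedef, with uint_fast32_t as the fallback for everything else.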

Keith Thompson answered Oct 29 '22 12:10
