I wonder if it's a good idea to keep using int (which is 32 bits on both x86 and x86_64) in 64-bit programs for variables that have nothing special about them and do not really need to span up to 2^64, like iteration counters, or if it's better to use size_t, which matches the word size of the CPU.
For sure if you keep using int you save half of the memory, and that could matter for the CPU cache, but I don't know whether on a 64-bit machine every 32-bit number has to be extended to 64 bits before any use.

EDIT: I've run some tests with a program of mine (see the self-answer; I still keep janneb's as accepted because it is good). It turns out that there is a significant performance improvement.
A 32-bit system can access 2^32 different memory addresses, i.e. ideally 4 GB of physical memory (with extensions it can address more than 4 GB of RAM). A 64-bit system can access 2^64 different memory addresses, i.e. roughly 18 quintillion bytes of RAM.

For this reason, running a 32-bit application on a 64-bit operating system is not the optimal choice in terms of performance. 32-bit drivers may also be incompatible with a 64-bit operating system; users need to switch to 64-bit drivers when upgrading.
Because the machine is byte-addressable, not bit-addressable; therefore adding 4 to a byte pointer advances it by 4 bytes, i.e. 32 bits.
The difference in performance between 32-bit and 64-bit versions of an application depends greatly on its type and the data types it processes. But in general you may expect a 2-20% performance gain from mere recompilation of a program; this is explained by architectural changes in 64-bit processors [1].
For array indices and pointer arithmetic, types which are of the same size as a pointer (typically, size_t and ptrdiff_t) can be better, as they avoid the need to zero or sign extend the register. Consider
float onei(float *a, int n)
{
return a[n];
}
float oneu(float *a, unsigned n)
{
return a[n];
}
float onep(float *a, ptrdiff_t n)
{
return a[n];
}
float ones(float *a, size_t n)
{
return a[n];
}
With GCC 4.4 -O2 on x86_64 the following asm is generated:
.p2align 4,,15
.globl onei
.type onei, @function
onei:
.LFB3:
.cfi_startproc
movslq %esi,%rsi
movss (%rdi,%rsi,4), %xmm0
ret
.cfi_endproc
.LFE3:
.size onei, .-onei
.p2align 4,,15
.globl oneu
.type oneu, @function
oneu:
.LFB4:
.cfi_startproc
mov %esi, %esi
movss (%rdi,%rsi,4), %xmm0
ret
.cfi_endproc
.LFE4:
.size oneu, .-oneu
.p2align 4,,15
.globl onep
.type onep, @function
onep:
.LFB5:
.cfi_startproc
movss (%rdi,%rsi,4), %xmm0
ret
.cfi_endproc
.LFE5:
.size onep, .-onep
.p2align 4,,15
.globl ones
.type ones, @function
ones:
.LFB6:
.cfi_startproc
movss (%rdi,%rsi,4), %xmm0
ret
.cfi_endproc
.LFE6:
.size ones, .-ones
As can be seen, the versions with the int and unsigned int index (onei and oneu) require an extra instruction (movslq/mov) to sign/zero-extend the register.
As was mentioned in a comment, the downside is that encoding an instruction that uses a 64-bit register takes more space than the 32-bit counterpart, bloating the code size. Secondly, ptrdiff_t/size_t variables need more memory than the equivalent int; if you have arrays of them, that can affect performance much more than the relatively small benefit of avoiding the zero/sign extension. If unsure, profile!
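To show where this matters in practice (a hypothetical summation loop, not code from the answer), an index type that matches the pointer size lets the compiler use the index register directly in the addressing mode:

```c
#include <stddef.h>

/* size_t index: on x86_64 the index register can be used in the
   addressing mode without a per-access sign/zero extension. */
float sum_sizet(const float *a, size_t n)
{
    float s = 0.0f;
    for (size_t i = 0; i < n; ++i)
        s += a[i];
    return s;
}

/* int index: the compiler may need to sign-extend i (or prove the
   loop cannot overflow) before forming each address. */
float sum_int(const float *a, int n)
{
    float s = 0.0f;
    for (int i = 0; i < n; ++i)
        s += a[i];
    return s;
}
```

In a simple loop like this, optimizing compilers can usually hoist the extension out of the loop, so the difference tends to appear in less regular indexing patterns; as the answer says, profile before committing either way.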
In terms of cache, it will save space; the cache handles blocks of data regardless of whether the CPU requested a single address or a complete chunk equal to the cache block size.

So if you are asking whether 32-bit numbers take 64 bits of space inside caches on 64-bit machines, the answer is no; they still take 32 bits for themselves. So in general it will save you some space, especially if you are using large, frequently accessed arrays.
In my personal opinion, a simple int looks simpler than size_t, and most editors will not recognize the size_t type, so syntax highlighting will also be better if you use int. ;)
I am coding a little hard-spheres model. The source can be found on GitHub.

I tried to keep using size_t for variables that are used as array indices, and int where I do other operations not related to word size. The performance improvement was significant: execution time dropped from ~27 to ~24, roughly an 11% improvement.