 

Should I use i32 or i64 on a 64-bit machine?

Tags:

integer

rust

main.rs

#![feature(core_intrinsics)]
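// Print the name of the type the compiler inferred for a value
// (type_name is a nightly-only intrinsic here).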
fn print_type_of<T>(_: &T) {
    println!("{}", unsafe { std::intrinsics::type_name::<T>() });
}

fn main() {
    let x = 93;
    let y = 93.1;

    print_type_of(&x);
    print_type_of(&y);
}

If I compile with "rustc +nightly ./main.rs", I get this output:

$ ./main

i32
f64

I run an x86_64 Linux machine. Floating-point variables are double precision by default, which is good. Why are integers only 4 bytes? Which should I use? If I don't need i64, should I use i32? Is i32 better for performance?

asked Aug 04 '18 by Giorgio Napolitano

1 Answer

Is i32 better for performance?

That's actually a somewhat subtle thing. If we look at recent instruction-level benchmarks, for example for Skylake-X, there is for the most part no clear difference between 64-bit and 32-bit instructions. An exception is division: 64-bit division is slower than 32-bit division, even when dividing the same values (division is one of the few variable-time instructions whose timing depends on the values of its inputs).
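As an illustration only (not a rigorous benchmark, and the exact numbers depend heavily on the CPU), a sketch along these lines runs the same divisions at 32-bit and 64-bit width; the function names are made up for the example:

use std::hint::black_box;
use std::time::Instant;

// Sum of 1_000_000 / i at 32-bit width; black_box stops the optimizer
// from constant-folding or hoisting the divisions.
fn div_sum_u32(n: u32) -> u32 {
    let mut acc = 0u32;
    for i in 1..n {
        acc = acc.wrapping_add(black_box(1_000_000u32) / black_box(i));
    }
    acc
}

// The same values, but the operands are 64-bit, so the CPU executes
// 64-bit division instructions.
fn div_sum_u64(n: u64) -> u64 {
    let mut acc = 0u64;
    for i in 1..n {
        acc = acc.wrapping_add(black_box(1_000_000u64) / black_box(i));
    }
    acc
}

fn main() {
    let t = Instant::now();
    black_box(div_sum_u32(10_000_000));
    println!("u32 division: {:?}", t.elapsed());

    let t = Instant::now();
    black_box(div_sum_u64(10_000_000));
    println!("u64 division: {:?}", t.elapsed());
}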

Using i64 for data also makes auto-vectorization less effective; this is also one of the rare places where data smaller than 32 bits has a use beyond plain data-size optimization. Of course data size matters for the i32 vs i64 question too: working with sizable arrays of i64 can easily be slower simply because the data is bigger, costing more space in the caches and (if applicable) more memory bandwidth. So if the question is [i32] vs [i64], then it matters.
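As a rough sketch of the data-size side (the names and sizes here are purely illustrative), the same reduction over a [i32] touches half the bytes of the [i64] version, and the auto-vectorizer can pack twice as many lanes per SIMD register (e.g. 8 x i32 vs 4 x i64 with AVX2):

use std::mem::size_of;

// Same logical computation at two element widths; the i32 version costs
// half the cache space and memory bandwidth.
fn sum_i32(data: &[i32]) -> i32 {
    data.iter().sum()
}

fn sum_i64(data: &[i64]) -> i64 {
    data.iter().sum()
}

fn main() {
    let a = vec![1i32; 1_000_000];
    let b = vec![1i64; 1_000_000];
    println!("sums: {} {}", sum_i32(&a), sum_i64(&b));
    println!("i32 array: {} bytes, i64 array: {} bytes",
             a.len() * size_of::<i32>(),
             b.len() * size_of::<i64>());
}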

Even more subtle is the fact that using 64-bit operations means the code will contain more REX prefixes on average, making it slightly less dense, so less of it fits in the L1 instruction cache at once. This is a small effect though; just having some 64-bit variables in the code is not a problem.

Despite all that, definitely don't overuse i32, especially in places where you should really have a usize. For example, do not do this:

// don't do this
for i in 0i32 .. data.len() as i32 { 
  sum += data[i as usize]; 
}

This causes a large performance regression: not only is there a pointless sign-extension in the loop now, it also defeats bounds-check elimination and auto-vectorization. Of course there is no reason to write code like that in the first place; it's unnatural and harder than doing it right.
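For contrast, a sketch of the natural version (the i64 accumulator type is just an assumption about what the sum should be): iterating the elements directly means there is no index cast, no sign-extension, and the compiler is free to elide the bounds checks and auto-vectorize.

fn sum(data: &[i32]) -> i64 {
    let mut sum = 0i64;
    // Iterate the elements directly instead of indexing with a cast counter.
    for &value in data {
        sum += i64::from(value);
    }
    sum
    // or simply: data.iter().map(|&v| i64::from(v)).sum()
}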

answered Oct 19 '22 by harold