
Chaining checked arithmetic operations in Rust

When doing integer arithmetic with overflow checking, a calculation often needs to compose several elementary operations. A straightforward way of chaining checked arithmetic in Rust uses the checked_* methods and Option chaining:

fn calculate_size(elem_size: usize,
                  length: usize,
                  offset: usize)
                  -> Option<usize> {
    elem_size.checked_mul(length)
             .and_then(|acc| acc.checked_add(offset))
}
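
As an aside (not part of the original question), on current Rust, where the ? operator works on Option, the same chain can be written more tersely; this is just an equivalent sketch:

fn calculate_size(elem_size: usize,
                  length: usize,
                  offset: usize)
                  -> Option<usize> {
    // `?` short-circuits to `None` exactly like the `and_then` chain above.
    elem_size.checked_mul(length)?.checked_add(offset)
}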

However, this tells the compiler to generate a branch for each elementary operation. I have encountered a more unrolled approach that uses the overflowing_* methods:

fn calculate_size(elem_size: usize,
                  length: usize,
                  offset: usize)
                  -> Option<usize> {
    let (acc, oflo1) = elem_size.overflowing_mul(length);
    let (acc, oflo2) = acc.overflowing_add(offset);
    if oflo1 | oflo2 {
        None
    } else {
        Some(acc)
    }
}

Continuing the computation regardless of overflows and aggregating the overflow flags with bitwise OR ensures that at most one branch is taken in the entire evaluation (provided that the implementations of overflowing_* generate branchless code). This optimization-friendly approach is more cumbersome and requires some caution in dealing with intermediate values, as the sketch below illustrates.
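
Here is a sketch (not from the original question) extending the same pattern to a longer chain; the extra padding parameter and the function name are made up for the example:

fn calculate_padded_size(elem_size: usize,
                         length: usize,
                         offset: usize,
                         padding: usize)
                         -> Option<usize> {
    // Keep computing on the (possibly wrapped) intermediates and OR all
    // of the overflow flags together, branching only once at the end.
    let (acc, oflo1) = elem_size.overflowing_mul(length);
    let (acc, oflo2) = acc.overflowing_add(offset);
    let (acc, oflo3) = acc.overflowing_add(padding);
    if oflo1 | oflo2 | oflo3 {
        None
    } else {
        Some(acc)
    }
}

Note that once an overflow has occurred, acc holds a wrapped value. That is harmless here because the result is discarded in the end, but an operation such as a division by a possibly wrapped-to-zero intermediate would need extra care.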

Does anyone have experience with how the Rust compiler optimizes either of the patterns above on various CPU architectures, to tell whether the explicit unrolling is worthwhile, especially for more complex expressions?

asked Mar 20 '16 by mzabaluev

1 Answer

Does anyone have experience with how the Rust compiler optimizes either of the patterns above on various CPU architectures, to tell whether the explicit unrolling is worthwhile, especially for more complex expressions?

You can use the playground to check how LLVM optimizes things: just click on "LLVM IR" or "ASM" instead of "Run". Stick a #[inline(never)] on the function you wish to check, and make sure to pass it run-time arguments to avoid constant folding, as here:

use std::env;

#[inline(never)]
fn calculate_size(elem_size: usize,
                  length: usize,
                  offset: usize)
                  -> Option<usize> {
    let (acc, oflo1) = elem_size.overflowing_mul(length);
    let (acc, oflo2) = acc.overflowing_add(offset);
    if oflo1 | oflo2 {
        None
    } else {
        Some(acc)
    }
}

fn main() {
    let vec: Vec<usize> = env::args().map(|s| s.parse().unwrap()).collect();
    let result = calculate_size(vec[0], vec[1], vec[2]);
    println!("{:?}", result);
}

The answer you'll get, however, is that the overflow intrinsics in Rust and LLVM have been coded for convenience and not for performance, unfortunately. This means that while the explicit unrolling optimizes well, counting on LLVM to optimize the checked code is not realistic for now.
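
For a side-by-side comparison (a sketch, not part of the original answer), the checked_* version from the question can be pasted into the same playground program under a different name, also marked #[inline(never)], so that the generated IR or assembly of the two approaches can be diffed directly:

#[inline(never)]
fn calculate_size_checked(elem_size: usize,
                          length: usize,
                          offset: usize)
                          -> Option<usize> {
    // Same logic as in the question: each checked_* call introduces its
    // own branch on the overflow condition.
    elem_size.checked_mul(length)
             .and_then(|acc| acc.checked_add(offset))
}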

Normally this is not an issue; but for a performance hotspot, you may want to unroll manually.

Note: this lack of performance is also the reason that overflow checking is disabled by default in Release mode.

answered Oct 23 '22 by Matthieu M.