Why is this Rust program so slow? Did I miss something?

Tags:

rust

I read Minimal distance in Manhattan metric and rewrote the author's "naive" implementation in Rust. The C++ variant is:

#include <utility>
#include <cstdio>
#include <cstdlib>

std::pair<int, int> pointsA[1000001];
std::pair<int, int> pointsB[1000001];

int main() {
    int n, t;
    unsigned long long dist;

    scanf("%d", &t);

    while(t-->0) {
        dist = 4000000000LL;
        scanf("%d", &n);

        for(int i = 0; i < n; i++) {
            scanf("%d%d", &pointsA[i].first, &pointsA[i].second);
        }

        for(int i = 0; i < n; i++) {
            scanf("%d%d", &pointsB[i].first, &pointsB[i].second);
        }

        for(int i = 0; i < n ;i++) {
            for(int j = 0; j < n ; j++) {
                if(abs(pointsA[i].first - pointsB[j].first) + abs(pointsA[i].second - pointsB[j].second) < dist)
                    dist = abs(pointsA[i].first - pointsB[j].first) + abs(pointsA[i].second - pointsB[j].second);
            }
        }
        printf("%lld\n", dist);
    }
}

The Rust variant is:

use std::io;
use std::io::BufReader;
use std::io::BufRead;

fn read_array(stdin: &mut BufReader<io::Stdin>, array_len: usize, points: &mut Vec<(i32, i32)>) {
    let mut line = String::new();
    for _ in 0..array_len {
        line.clear();
        stdin.read_line(&mut line).unwrap();
        let mut item = line.split_whitespace();
        let x = item.next().unwrap().parse().unwrap();
        let y = item.next().unwrap().parse().unwrap();
        points.push((x, y));
    }
}

fn manhattan_dist(a: &(i32, i32), b: &(i32, i32)) -> u32 {
    ((a.0 - b.0).abs() + (a.1 - b.1).abs()) as u32
}

fn main() {
    let mut line = String::new();
    let mut stdin = BufReader::new(io::stdin());
    stdin.read_line(&mut line).unwrap();
    let n_iters = line.trim_right().parse::<usize>().unwrap();
    let mut points_a = Vec::with_capacity(10000);
    let mut points_b = Vec::with_capacity(10000);
    for _ in 0..n_iters {
        line.clear();
        stdin.read_line(&mut line).unwrap();
        let set_len = line.trim_right().parse::<usize>().unwrap();
        points_a.clear();
        points_b.clear();
        read_array(&mut stdin, set_len, &mut points_a);
        read_array(&mut stdin, set_len, &mut points_b);
        let mut dist = u32::max_value();
        for i in points_a.iter() {
            for j in points_b.iter() {
                dist = std::cmp::min(manhattan_dist(i, j), dist);
            }
        }
        println!("{}", dist);
    }
}

Then, I generated data with a Python script:

import random

ITER = 100
N = 10000
MAX_INT = 1000000

print("%d" % ITER)

for _ in range(0, ITER):
    print("%d" % N)
    for _ in range(0, N):
        print(random.randrange(-MAX_INT, MAX_INT + 1), random.randrange(1, MAX_INT + 1))
    for _ in range(0, N):
        print(random.randrange(-MAX_INT, MAX_INT + 1), random.randrange(-MAX_INT, 0))

And compiled both variants with g++ -Ofast -march=native and rustc -C opt-level=3 respectively. The timings are:

C++

real    0m7.789s
user    0m7.760s
sys     0m0.020s

Rust

real    0m28.589s
user    0m28.570s
sys     0m0.010s

Why is my Rust code four times slower than the C++ variant? I am using Rust 1.12.0-beta.1.

I added time measurements:

let now = SystemTime::now();
line.clear();
stdin.read_line(&mut line).unwrap();
let set_len = line.trim_right().parse::<usize>().unwrap();
points_a.clear();
points_b.clear();
read_array(&mut stdin, set_len, &mut points_a);
read_array(&mut stdin, set_len, &mut points_b);
io_time += now.elapsed().unwrap();

let now = SystemTime::now();
let mut dist = u32::max_value();
for i in points_a.iter() {
    for j in points_b.iter() {
        dist = std::cmp::min(manhattan_dist(i, j), dist);
    }
}
calc_time += now.elapsed().unwrap();

And writeln!(&mut std::io::stderr(), "io_time: {}, calc_time: {}", io_time.as_secs(), calc_time.as_secs()).unwrap(); prints io_time: 0, calc_time: 27.

I tried nightly rustc 1.13.0-nightly (e9bc1bac8 2016-08-24):

$ time ./test_rust < data.txt  > test3_res
io_time: 0, calc_time: 19

real    0m19.592s
user    0m19.560s
sys     0m0.020s
$ time ./test1 < data.txt  > test1_res

real    0m7.797s
user    0m7.780s
sys     0m0.010s

So it is at now ~ 2.7x difference on my Core i7.

201

asked Aug 27 '16 23:08

2 Answers

The difference is of course -march=native... kind of. Rust has this through -C target_cpu=native, but this doesn't give the same speed benefit. This is because LLVM is unwilling to vectorize in this context, whereas GCC does. You may note that using Clang, a C++ compiler that also uses LLVM, also produces relatively slow code.

To encourage LLVM to vectorize, you can move the main loop into a separate function. Alternatively, you can use a local block. If you write the code carefully as

let dist = {
    let mut dist = i32::max_value();
    for &(a, b) in &points_a[..n] {
        for &(c, d) in &points_b[..n] {
            dist = std::cmp::min(((a - c).abs() + (b - d).abs()), dist);
        }
    }
    dist
} as u32;

the difference between Rust and C++ is then near-negligible (~4%).

152

answered Oct 08 '22 00:10

Veedrac

The vast majority of the performance you're seeing in C++ is due to the flag -march=native.

This flag is not the equivalent flag to Rust's --release. It uses CPU instructions specific to the CPU it is compiled on, so math in particular is going to be way faster.

Removing that flag puts the C++ code at 19 seconds.

Then there's the unsafety present in the C++ code. None of the input is checked. The Rust code does check it, you use .unwrap() – unwrap has a performance cost, there's an assertion, then the code necessary for unwinding, etc.

Using if lets instead of raw unwraps, or ignoring results where possible, brings the Rust code down again.

Rust: 22 seconds

C++: 19 seconds

Where's the 3 seconds coming from? A bit of playing around leads me to believe it's println! vs. printf, but I don't have hard numbers for the C++ code. What I can say is that the Rust code drops to 13 seconds when I perform the printing outside of the benchmark.

TLDR: Your compiler flags are different, and your C++ code is not safe.

answered Oct 08 '22 02:10

IBit

Related questions
                            
                                How to create and write to memory mapped files?
                            
                                Is there another option to share an Arc in multiple closures besides cloning it before each closure?
                            
                                How to get file path without extension in Rust?
                            
                                Is there any way to get the address of a `struct` in Rust?
                            
                                How do I automatically implement comparison for structs with floats in Rust?
                            
                                Understanding struct-field mutation
                            
                                How to match on data type in Rust?
                            
                                How do I include the end value in a range?
                            
                                Can array lengths be inferred in Rust?
                            
                                OpenSSL crate fails compilation on Mac OS X 10.11
                            
                                How can I 0-pad a number by a variable amount when formatting with std::fmt?
                            
                                Why does Iterator::take_while take ownership of the iterator?
                            
                                Why does Rust not allow the copy and drop traits on one type?
                            
                                How do I process enum/struct/field attributes in a procedural macro?
                            
                                Check if string ends with given suffix in Rust
                            
                                How to deserialize a JSON file which contains null values using Serde?
                            
                                How does Rust guarantee memory safety and prevent segfaults?
                            
                                How can I install Rust using rustup on Windows when behind a firewall?
                            
                                How to idiomatically alias a crate in Rust 2018?
                            
                                What is the difference between using the box keyword and Box::new?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With