Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why is this Rust program so slow? Did I miss something?

Tags:

rust

I read Minimal distance in Manhattan metric and rewrote the author's "naive" implementation in Rust. The C++ variant is:

#include <utility>
#include <cstdio>
#include <cstdlib>

std::pair<int, int> pointsA[1000001];
std::pair<int, int> pointsB[1000001];

int main() {
    int n, t;
    unsigned long long dist;

    scanf("%d", &t);

    while(t-->0) {
        dist = 4000000000LL;
        scanf("%d", &n);

        for(int i = 0; i < n; i++) {
            scanf("%d%d", &pointsA[i].first, &pointsA[i].second);
        }

        for(int i = 0; i < n; i++) {
            scanf("%d%d", &pointsB[i].first, &pointsB[i].second);
        }

        for(int i = 0; i < n ;i++) {
            for(int j = 0; j < n ; j++) {
                if(abs(pointsA[i].first - pointsB[j].first) + abs(pointsA[i].second - pointsB[j].second) < dist)
                    dist = abs(pointsA[i].first - pointsB[j].first) + abs(pointsA[i].second - pointsB[j].second);
            }
        }
        printf("%lld\n", dist);
    }
}

The Rust variant is:

use std::io;
use std::io::BufReader;
use std::io::BufRead;

fn read_array(stdin: &mut BufReader<io::Stdin>, array_len: usize, points: &mut Vec<(i32, i32)>) {
    let mut line = String::new();
    for _ in 0..array_len {
        line.clear();
        stdin.read_line(&mut line).unwrap();
        let mut item = line.split_whitespace();
        let x = item.next().unwrap().parse().unwrap();
        let y = item.next().unwrap().parse().unwrap();
        points.push((x, y));
    }
}

fn manhattan_dist(a: &(i32, i32), b: &(i32, i32)) -> u32 {
    ((a.0 - b.0).abs() + (a.1 - b.1).abs()) as u32
}

fn main() {
    let mut line = String::new();
    let mut stdin = BufReader::new(io::stdin());
    stdin.read_line(&mut line).unwrap();
    let n_iters = line.trim_right().parse::<usize>().unwrap();
    let mut points_a = Vec::with_capacity(10000);
    let mut points_b = Vec::with_capacity(10000);
    for _ in 0..n_iters {
        line.clear();
        stdin.read_line(&mut line).unwrap();
        let set_len = line.trim_right().parse::<usize>().unwrap();
        points_a.clear();
        points_b.clear();
        read_array(&mut stdin, set_len, &mut points_a);
        read_array(&mut stdin, set_len, &mut points_b);
        let mut dist = u32::max_value();
        for i in points_a.iter() {
            for j in points_b.iter() {
                dist = std::cmp::min(manhattan_dist(i, j), dist);
            }
        }
        println!("{}", dist);
    }
}

Then, I generated data with a Python script:

import random

ITER = 100
N = 10000
MAX_INT = 1000000

print("%d" % ITER)

for _ in range(0, ITER):
    print("%d" % N)
    for _ in range(0, N):
        print(random.randrange(-MAX_INT, MAX_INT + 1), random.randrange(1, MAX_INT + 1))
    for _ in range(0, N):
        print(random.randrange(-MAX_INT, MAX_INT + 1), random.randrange(-MAX_INT, 0))

And compiled both variants with g++ -Ofast -march=native and rustc -C opt-level=3 respectively. The timings are:

C++

real    0m7.789s
user    0m7.760s
sys     0m0.020s

Rust

real    0m28.589s
user    0m28.570s
sys     0m0.010s

Why is my Rust code four times slower than the C++ variant? I am using Rust 1.12.0-beta.1.

I added time measurements:

let now = SystemTime::now();
line.clear();
stdin.read_line(&mut line).unwrap();
let set_len = line.trim_right().parse::<usize>().unwrap();
points_a.clear();
points_b.clear();
read_array(&mut stdin, set_len, &mut points_a);
read_array(&mut stdin, set_len, &mut points_b);
io_time += now.elapsed().unwrap();

let now = SystemTime::now();
let mut dist = u32::max_value();
for i in points_a.iter() {
    for j in points_b.iter() {
        dist = std::cmp::min(manhattan_dist(i, j), dist);
    }
}
calc_time += now.elapsed().unwrap();

And writeln!(&mut std::io::stderr(), "io_time: {}, calc_time: {}", io_time.as_secs(), calc_time.as_secs()).unwrap(); prints io_time: 0, calc_time: 27.

I tried nightly rustc 1.13.0-nightly (e9bc1bac8 2016-08-24):

$ time ./test_rust < data.txt  > test3_res
io_time: 0, calc_time: 19

real    0m19.592s
user    0m19.560s
sys     0m0.020s
$ time ./test1 < data.txt  > test1_res

real    0m7.797s
user    0m7.780s
sys     0m0.010s

So it is at now ~ 2.7x difference on my Core i7.

like image 201
user1244932 Avatar asked Aug 27 '16 23:08

user1244932


People also ask

Why is Rust compiling slowly?

That is because the languages are different and choose different tradeoffs. The tradeoffs might make a language inapplicable in your domain. Rust has always been marketed as a "systems programming language". C++ is Rust's closes competitor in this domain space and it suffers from terribly slow compiling as well.

Is Rust slow to develop in?

Rust's compile times are notoriously slow. Rust development was slow enough on my laptop that I finally gave up on mobile computing and bought a desktop with a top-of-the line CPU (12900K). Along the way I switched from Windows to Linux (more on that later) and started using the mold linker, and now… things are OK!

Is Rust a slow language?

Moreover, it is slow. Rust is a snail compared to other languages. Even for small projects, the compile times are painfully long, and runtime measurements show that Rust is less efficient than the C programs.

Why use Rust over c++?

You can easily notice similarities between Rust and C++ syntax, but Rust offers a higher level of memory safety without using a garbage collector. Not for the first time, Rust has been named the most loved language—it gained more than 86% of developers' votes.


2 Answers

The difference is of course -march=native... kind of. Rust has this through -C target_cpu=native, but this doesn't give the same speed benefit. This is because LLVM is unwilling to vectorize in this context, whereas GCC does. You may note that using Clang, a C++ compiler that also uses LLVM, also produces relatively slow code.

To encourage LLVM to vectorize, you can move the main loop into a separate function. Alternatively, you can use a local block. If you write the code carefully as

let dist = {
    let mut dist = i32::max_value();
    for &(a, b) in &points_a[..n] {
        for &(c, d) in &points_b[..n] {
            dist = std::cmp::min(((a - c).abs() + (b - d).abs()), dist);
        }
    }
    dist
} as u32;

the difference between Rust and C++ is then near-negligible (~4%).

like image 152
Veedrac Avatar answered Oct 08 '22 00:10

Veedrac


The vast majority of the performance you're seeing in C++ is due to the flag -march=native.

This flag is not the equivalent flag to Rust's --release. It uses CPU instructions specific to the CPU it is compiled on, so math in particular is going to be way faster.

Removing that flag puts the C++ code at 19 seconds.

Then there's the unsafety present in the C++ code. None of the input is checked. The Rust code does check it, you use .unwrap()unwrap has a performance cost, there's an assertion, then the code necessary for unwinding, etc.

Using if lets instead of raw unwraps, or ignoring results where possible, brings the Rust code down again.

Rust: 22 seconds

C++: 19 seconds

Where's the 3 seconds coming from? A bit of playing around leads me to believe it's println! vs. printf, but I don't have hard numbers for the C++ code. What I can say is that the Rust code drops to 13 seconds when I perform the printing outside of the benchmark.

TLDR: Your compiler flags are different, and your C++ code is not safe.

like image 21
IBit Avatar answered Oct 08 '22 02:10

IBit