I wanted to learn a bit about Rust tasks, so I wrote a Monte Carlo computation of pi. My puzzle is why the single-threaded C version is 4 times faster than the 4-way threaded Rust version. Clearly I am doing something wrong, or my mental performance model is way off.
Here's the C version:
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>
#include <stdio.h>

#define PI 3.1415926535897932

double monte_carlo_pi(int nparts)
{
    int i, in=0;
    double x, y;
    srand(getpid());
    for (i=0; i<nparts; i++) {
        x = (double)rand()/(double)RAND_MAX;
        y = (double)rand()/(double)RAND_MAX;
        if (x*x + y*y < 1.0) {
            in++;
        }
    }
    return in/(double)nparts * 4.0;
}

int main(int argc, char **argv)
{
    int nparts;
    double mc_pi;
    nparts = atoi(argv[1]);
    mc_pi = monte_carlo_pi(nparts);
    printf("computed: %f error: %f\n", mc_pi, mc_pi - PI);
}
The Rust version was not a line-by-line port:
use std::rand;
use std::rand::distributions::{IndependentSample,Range};

fn monte_carlo_pi(nparts: uint ) -> uint {
    let between = Range::new(0f64,1f64);
    let mut rng = rand::task_rng();
    let mut in_circle = 0u;
    for _ in range(0u, nparts) {
        let a = between.ind_sample(&mut rng);
        let b = between.ind_sample(&mut rng);
        if a*a + b*b <= 1.0 {
            in_circle += 1;
        }
    }
    in_circle
}

fn main() {
    let (tx, rx) = channel();
    let ntasks = 4u;
    let nparts = 100000000u; /* I haven't learned how to parse cmnd line args yet!*/
    for _ in range(0u, ntasks) {
        let child_tx = tx.clone();
        spawn(proc() {
            child_tx.send(monte_carlo_pi(nparts/ntasks));
        });
    }
    let result = rx.recv() + rx.recv() + rx.recv() + rx.recv();
    println!("pi is {}", (result as f64)/(nparts as f64)*4.0);
}
Build and time the C version:
$ clang -O2 mc-pi.c -o mc-pi-c; time ./mc-pi-c 100000000
computed: 3.141700 error: 0.000108
./mc-pi-c 100000000 1.68s user 0.00s system 99% cpu 1.683 total
Build and time the Rust version:
$ rustc -v
rustc 0.12.0-nightly (740905042 2014-09-29 23:52:21 +0000)
$ rustc --opt-level 2 --debuginfo 0 mc-pi.rs -o mc-pi-rust; time ./mc-pi-rust
pi is 3.141327
./mc-pi-rust 2.40s user 24.56s system 352% cpu 7.654 total
As far as speed and performance go, Rust is on the same page as C++. There are situations where it is easier to write faster programs in C++ because it's easy to ignore fundamental problems in the programs. Even from this small sample, both are clearly fast.
However, Rust programs also optimize quite well, sometimes better than C. While C is good for writing minimal code at the byte-by-byte, pointer-by-pointer level, Rust has powerful features for efficiently combining multiple functions or even whole libraries together.
Rust without safety checks: the resulting transpiled code forwards a packet in only 91 cycles while executing 121 instructions, which is faster than the original code. This holds true even when the original code is compiled with clang and the same LLVM version.
Rust was created to provide high performance, comparable to C and C++, with a strong emphasis on the code's safety. C compilers don't really care about safety, which means programmers need to take care not to write a program that causes memory violations or data races.
The bottleneck, as Dogbert observed, was the random number generator. Here's one that is fast and seeded differently on each thread:
fn monte_carlo_pi(id: u32, nparts: uint) -> uint {
    ...
    let mut rng: XorShiftRng = SeedableRng::from_seed([id, id, id, id]);
    ...
}
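For anyone who wants to try this on a current toolchain, here is a minimal, self-contained sketch of the same idea: each thread owns a cheap xorshift generator seeded from its id. It is a stand-in built on assumptions, not the code from the answer above. `XorShift64`, `estimate_pi`, and the seed-mixing constants are names and choices made up for illustration. It uses only the standard library rather than the `rand` crate, and it targets modern Rust (`std::thread`) instead of the 2014 nightly.

```rust
use std::thread;

/// Minimal xorshift64 generator (illustrative stand-in for XorShiftRng).
struct XorShift64 {
    state: u64,
}

impl XorShift64 {
    /// Xorshift state must never be zero, so the seed is mixed and
    /// forced to a non-zero value. The constants are arbitrary choices.
    fn new(seed: u64) -> Self {
        let mixed = seed.wrapping_mul(0x9E37_79B9_7F4A_7C15) ^ 0xD1B5_4A32_D192_ED03;
        XorShift64 { state: if mixed == 0 { 1 } else { mixed } }
    }

    fn next_u64(&mut self) -> u64 {
        let mut x = self.state;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.state = x;
        x
    }

    /// Uniform f64 in [0, 1), built from the top 53 bits.
    fn next_f64(&mut self) -> f64 {
        (self.next_u64() >> 11) as f64 / (1u64 << 53) as f64
    }
}

/// Counts how many of `nparts` random points land inside the unit circle.
/// Each worker gets its own generator, seeded from its `id`.
fn monte_carlo_pi(id: u64, nparts: u64) -> u64 {
    let mut rng = XorShift64::new(id + 1);
    let mut in_circle = 0u64;
    for _ in 0..nparts {
        let a = rng.next_f64();
        let b = rng.next_f64();
        if a * a + b * b <= 1.0 {
            in_circle += 1;
        }
    }
    in_circle
}

/// Splits the work across `ntasks` threads and combines the counts.
fn estimate_pi(ntasks: u64, nparts: u64) -> f64 {
    let per_task = nparts / ntasks;
    let handles: Vec<_> = (0..ntasks)
        .map(|id| thread::spawn(move || monte_carlo_pi(id, per_task)))
        .collect();
    let hits: u64 = handles.into_iter().map(|h| h.join().unwrap()).sum();
    // Divide by the number of points actually sampled.
    hits as f64 / (per_task * ntasks) as f64 * 4.0
}

fn main() {
    println!("pi is {}", estimate_pi(4, 100_000_000));
}
```

No generator state is shared between threads, so the inner loop is plain arithmetic. A simple xorshift like this is fine for a toy estimate but is not suitable for anything security-sensitive.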
Meaningful benchmarks are a tricky thing, because you have all kinds of optimization options, etc. Also, the structure of the code can have a huge impact.
Comparing C and Rust is a little like comparing apples and oranges. We typically use compute-intensive algorithms like the one you depict above, but the real world can throw you a curve.
Having said that, Rust can and does approach the performance of C and C++, and most likely can do better on concurrency tasks.
Take a look at the benchmarks here:
https://benchmarksgame-team.pages.debian.net/benchmarksgame/fastest/rust-clang.html
I chose the Rust vs. C Clang benchmark comparison, because both rely on the underlying LLVM.
On the other hand, a comparison with C gcc yields different results:
And guess what? Rust still comes out ahead!
I entreat you to explore the Benchmarks Game site in more detail. There are some cases where C will edge out Rust.
In general, when you are creating a real-world solution, you want to do performance benchmarks for your specific cases. Always do this, because you will often be surprised by the results. Never assume.
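As a rough illustration of measuring rather than assuming, the sketch below times a workload several times and keeps the fastest run. `best_of` and `sum_of_squares` are hypothetical names, and the workload is only a placeholder for whatever code you actually care about. A dedicated benchmarking tool will give more trustworthy numbers than this.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Runs `f` `runs` times and returns the fastest wall-clock time observed.
/// Taking the minimum reduces noise from other activity on the machine.
fn best_of<F: FnMut() -> u64>(runs: u32, mut f: F) -> Duration {
    let mut best = Duration::MAX;
    for _ in 0..runs {
        let start = Instant::now();
        // black_box keeps the optimizer from discarding the result.
        black_box(f());
        best = best.min(start.elapsed());
    }
    best
}

/// Placeholder workload: wrapping sum of i*i for i in 0..n.
fn sum_of_squares(n: u64) -> u64 {
    (0..black_box(n))
        .map(|i| i.wrapping_mul(i))
        .fold(0, u64::wrapping_add)
}

fn main() {
    let t = best_of(5, || sum_of_squares(10_000_000));
    println!("best of 5: {:?}", t);
}
```

Build with optimizations (for example `rustc -O`) before drawing conclusions. Timings from an unoptimized build say little about how the code will behave in production.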
I think that too many times, benchmarks are used to advance the "my language is better than your language" style of wars. But as one who has used over 20 computer languages throughout his longish career, I always say that it is a matter of the best tool for the job.