Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is `iter().map().sum()` as fast as `iter().fold()`?

Tags:

rust

Does the compiler generate the same code for iter().map().sum() and iter().fold()? In the end they achieve the same goal, but the first code would iterate two times, once for the map and once for the sum.

Here is an example. Which version would be faster in total?

pub fn square(s: u32) -> u64 {
    match s {
        s @ 1...64 => 2u64.pow(s - 1),
        _ => panic!("Square must be between 1 and 64")
    }
}

pub fn total() -> u64 {
    // A fold
    (0..64).fold(0u64, |r, s| r + square(s + 1))
    // or a map
    (1..64).map(square).sum()
}

What would be good tools to look at the assembly or benchmark this?

like image 213
Max Linke Avatar asked Nov 05 '16 17:11

Max Linke


People also ask

How do you know if an iterator is empty in Rust?

You can make your iterator peekable and peek the first item; if it's None , then the iterator is empty.

What is iterator in Rust?

An iterator is responsible for the logic of iterating over each item and determining when the sequence has finished. When you use iterators, you don't have to reimplement that logic yourself. In Rust, iterators are lazy, meaning they have no effect until you call methods that consume the iterator to use it up.

What is Rust collect?

In Rust programs, iterators are often used to modify or generate data. But to have a more usable collection, like a Vec or String, we must convert the iterator. With collect, we have a powerful built-in function that converts an iterator into a type like a Vec or String. Collect can perform many conversions.


1 Answers

For them to generate the same code, they'd first have to do the same thing. Your two examples do not:

fn total_fold() -> u64 {
    (0..64).fold(0u64, |r, s| r + square(s + 1))
}

fn total_map() -> u64 {
    (1..64).map(square).sum()
}

fn main() {
    println!("{}", total_fold());
    println!("{}", total_map());
}
18446744073709551615
9223372036854775807

Let's assume you meant

fn total_fold() -> u64 {
    (1..64).fold(0u64, |r, s| r + square(s + 1))
}

fn total_map() -> u64 {
    (1..64).map(|i| square(i + 1)).sum()
}

There are a few avenues to check:

  1. The generated LLVM IR
  2. The generated assembly
  3. Benchmark

The easiest source for the IR and assembly is one of the playgrounds (official or alternate). These both have buttons to view the assembly or IR. You can also pass --emit=llvm-ir or --emit=asm to the compiler to generate these files.

Make sure to generate assembly or IR in release mode. The attribute #[inline(never)] is often useful to keep functions separate to find them easier in the output.

Benchmarking is documented in The Rust Programming Language, so there's no need to repeat all that valuable information.


Before Rust 1.14, these do not produce the exact same assembly. I'd wait for benchmarking / profiling data to see if there's any meaningful impact on performance before I worried.

As of Rust 1.14, they do produce the same assembly! This is one reason I love Rust. You can write clear and idiomatic code and smart people come along and make it equally as fast.

but the first code would iterate two times, once for the map and once for the sum.

This is incorrect, and I'd love to know what source told you this so we can go correct it at that point and prevent future misunderstandings. An iterator operates on a pull basis; one element is processed at a time. The core method is next, which yields a single value, running just enough computation to produce that value.

like image 56
Shepmaster Avatar answered Sep 27 '22 19:09

Shepmaster