Does the compiler generate the same code for iter().map().sum()
and iter().fold()
? In the end they achieve the same goal, but the first code would iterate two times, once for the map
and once for the sum
.
Here is an example. Which version would be faster in total
?
pub fn square(s: u32) -> u64 {
match s {
s @ 1...64 => 2u64.pow(s - 1),
_ => panic!("Square must be between 1 and 64")
}
}
pub fn total() -> u64 {
// A fold
(0..64).fold(0u64, |r, s| r + square(s + 1))
// or a map
(1..64).map(square).sum()
}
What would be good tools to look at the assembly or benchmark this?
You can make your iterator peekable and peek the first item; if it's None , then the iterator is empty.
An iterator is responsible for the logic of iterating over each item and determining when the sequence has finished. When you use iterators, you don't have to reimplement that logic yourself. In Rust, iterators are lazy, meaning they have no effect until you call methods that consume the iterator to use it up.
In Rust programs, iterators are often used to modify or generate data. But to have a more usable collection, like a Vec or String, we must convert the iterator. With collect, we have a powerful built-in function that converts an iterator into a type like a Vec or String. Collect can perform many conversions.
For them to generate the same code, they'd first have to do the same thing. Your two examples do not:
fn total_fold() -> u64 {
(0..64).fold(0u64, |r, s| r + square(s + 1))
}
fn total_map() -> u64 {
(1..64).map(square).sum()
}
fn main() {
println!("{}", total_fold());
println!("{}", total_map());
}
18446744073709551615
9223372036854775807
Let's assume you meant
fn total_fold() -> u64 {
(1..64).fold(0u64, |r, s| r + square(s + 1))
}
fn total_map() -> u64 {
(1..64).map(|i| square(i + 1)).sum()
}
There are a few avenues to check:
The easiest source for the IR and assembly is one of the playgrounds (official or alternate). These both have buttons to view the assembly or IR. You can also pass --emit=llvm-ir
or --emit=asm
to the compiler to generate these files.
Make sure to generate assembly or IR in release mode. The attribute #[inline(never)]
is often useful to keep functions separate to find them easier in the output.
Benchmarking is documented in The Rust Programming Language, so there's no need to repeat all that valuable information.
Before Rust 1.14, these do not produce the exact same assembly. I'd wait for benchmarking / profiling data to see if there's any meaningful impact on performance before I worried.
As of Rust 1.14, they do produce the same assembly! This is one reason I love Rust. You can write clear and idiomatic code and smart people come along and make it equally as fast.
but the first code would iterate two times, once for the
map
and once for thesum
.
This is incorrect, and I'd love to know what source told you this so we can go correct it at that point and prevent future misunderstandings. An iterator operates on a pull basis; one element is processed at a time. The core method is next
, which yields a single value, running just enough computation to produce that value.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With