
Cannot use Rayon's `.par_iter()`

I have a struct which implements Iterator and it works fine as an iterator. It produces values, and using .map(), I download each item from a local HTTP server and save the results. I now want to parallelize this operation, and Rayon looks friendly.

I am getting a compiler error when trying to follow the example in the documentation.

This is the code that works sequentially. generate_values returns the struct which implements Iterator. dl downloads the values and saves them (i.e. it has side effects). Since iterators are lazy in Rust, I have put a .count() at the end so that the iterator is actually consumed.

generate_values(14).map(|x| { dl(x, &path, &upstream_url); }).count();

Following the Rayon example I tried this:

generate_values(14).par_iter().map(|x| { dl(x, &path, &upstream_url); }).count();

and got the following error:

src/main.rs:69:27: 69:37 error: no method named `par_iter` found for type `MyIterator` in the current scope

Interestingly, when I use .iter(), which many Rust types provide, I get a similar error:

src/main.rs:69:27: 69:33 error: no method named `iter` found for type `MyIterator` in the current scope
src/main.rs:69     generate_values(14).iter().map(|tile| { dl_tile(tile, &tc_path, &upstream_url); }).count();

Since I implement Iterator, I should get .iter() for free, right? Is this why .par_iter() doesn't work?

Rust 1.6 and Rayon 0.3.1

$ rustc --version
rustc 1.6.0 (c30b771ad 2016-01-19)
Amandasaurus asked Mar 08 '16 09:03


2 Answers

Rayon 0.3.1 defines par_iter as:

pub trait IntoParallelRefIterator<'data> {
    type Iter: ParallelIterator<Item=&'data Self::Item>;
    type Item: Sync + 'data;

    fn par_iter(&'data self) -> Self::Iter;
}

Rayon itself implements this trait for only one type, the slice [T]:

impl<'data, T: Sync + 'data> IntoParallelRefIterator<'data> for [T] {
    type Item = T;
    type Iter = SliceIter<'data, T>;

    fn par_iter(&'data self) -> Self::Iter {
        self.into_par_iter()
    }
}

That's why Lukas Kalbertodt's suggestion to collect into a Vec will work; Vec dereferences to a slice, so a par_iter() call on a Vec resolves to the implementation for [T].

In general, Rayon cannot assume that an arbitrary iterator is amenable to parallelization, so it does not provide par_iter() for every Iterator.

Since you define the type that generate_values returns, you could implement the appropriate Rayon trait for it as well:
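The deref step can be seen without Rayon at all. The sketch below uses a hypothetical trait, SliceOnly, as a stand-in for a trait that (like IntoParallelRefIterator in Rayon 0.3.1) is implemented only for [T]; the trait and its describe method are invented for illustration and are not part of Rayon.

```rust
// Hypothetical trait, implemented for the slice type [T] and nothing else.
trait SliceOnly {
    fn describe(&self) -> String;
}

impl<T> SliceOnly for [T] {
    fn describe(&self) -> String {
        format!("slice of {} items", self.len())
    }
}

fn main() {
    let v: Vec<u32> = vec![1, 2, 3];
    // There is no impl of SliceOnly for Vec<u32>. Method lookup
    // auto-dereferences Vec<u32> to [u32] and finds the slice impl.
    println!("{}", v.describe());
}
```

A type that does not dereference to a slice, such as a custom iterator struct, gets no such fallback, which matches the "no method named `par_iter`" error in the question.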

  1. IntoParallelIterator
  2. IntoParallelRefIterator
  3. IntoParallelRefMutIterator

That should allow you to avoid collecting into a temporary vector.

Shepmaster answered Nov 20 '22 21:11


No, the Iterator trait has nothing to do with the iter() method. Yes, this is slightly confusing.

There are a few different concepts here. An Iterator is a type that can produce a sequence of values; it only needs to implement next() and gets many other methods for free, but none of these is iter(). Then there is IntoIterator, which says that a type can be transformed into an Iterator. This trait has the into_iter() method. The iter() method is not part of either of those two traits. It's just a normal method that many types define, and it often works similarly to into_iter(), except that by convention it borrows the collection and yields references instead of consuming it.

Now to your Rayon problem: it looks like you can't just take any normal iterator and turn it into a parallel one. However, I have never used this library, so take this with a grain of salt. To me it looks like you need to collect your iterator into a Vec to be able to use par_iter().

And just as a note: when using normal iterators purely for side effects, you shouldn't use map() and count(), but rather a standard for loop.
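A small sketch of the three concepts, using only the standard library. Counter and counter() are hypothetical stand-ins for the question's MyIterator and generate_values; they are not the asker's actual code.

```rust
// Hypothetical stand-in for MyIterator: yields 0, 1, ..., limit - 1.
struct Counter {
    current: u32,
    limit: u32,
}

impl Iterator for Counter {
    type Item = u32;

    // next() is the only method that has to be written by hand.
    fn next(&mut self) -> Option<u32> {
        if self.current < self.limit {
            let value = self.current;
            self.current += 1;
            Some(value)
        } else {
            None
        }
    }
}

// Hypothetical stand-in for generate_values.
fn counter(limit: u32) -> Counter {
    Counter { current: 0, limit }
}

fn main() {
    // Adaptors such as map() come for free with the Iterator trait.
    let doubled: Vec<u32> = counter(3).map(|x| x * 2).collect();
    println!("{:?}", doubled);

    // Every Iterator also implements IntoIterator (it returns itself),
    // which is what allows it to be used directly in a for loop.
    for x in counter(3) {
        println!("{}", x);
    }

    // counter(3).iter() would not compile: Counter defines no iter() method.

    // Vec gets iter() from the slice type; it borrows and yields &u32.
    let v: Vec<u32> = vec![10, 20, 30];
    let total: u32 = v.iter().sum();
    // into_iter() consumes the Vec and yields owned u32 values.
    let owned: Vec<u32> = v.into_iter().collect();
    println!("{} {:?}", total, owned);
}
```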
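To show the "collect first, then parallelize over the slice" idea without depending on Rayon, here is a rough sketch that swaps in std::thread::scope (available in much newer Rust than the 1.6 from the question). It is not how Rayon works internally and is no replacement for par_iter(); dl here is a hypothetical stand-in that returns a value instead of downloading anything, and process_in_parallel and its workers parameter are made up for this example.

```rust
use std::thread;

// Hypothetical stand-in for the question's dl(): no network access.
fn dl(x: u32) -> u32 {
    x * 10
}

fn process_in_parallel(values: impl Iterator<Item = u32>, workers: usize) -> Vec<u32> {
    // Step 1: drain the sequential iterator into a Vec so the items
    // live in one contiguous slice that can be split up.
    let items: Vec<u32> = values.collect();
    if items.is_empty() {
        return Vec::new();
    }

    // Step 2: split the slice into at most `workers` chunks (rounding up).
    let workers = workers.max(1);
    let chunk_len = (items.len() + workers - 1) / workers;

    // Step 3: process each chunk on its own scoped thread.
    thread::scope(|s| {
        let handles: Vec<_> = items
            .chunks(chunk_len)
            .map(|chunk| s.spawn(move || chunk.iter().map(|&x| dl(x)).collect::<Vec<u32>>()))
            .collect();

        // Joining in spawn order keeps the results in input order.
        handles
            .into_iter()
            .flat_map(|h| h.join().expect("worker thread panicked"))
            .collect()
    })
}

fn main() {
    let results = process_in_parallel(0..14, 4);
    println!("{:?}", results);
}
```

The point is only that the lazy, one-at-a-time iterator has to be turned into something splittable before the work can be shared out; with Rayon, that splittable thing is the slice behind the Vec.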
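A minimal sketch of the difference, with a hypothetical dl that records its argument in a log instead of downloading (the signature is invented for this example):

```rust
// Hypothetical stand-in for dl(): records the value instead of downloading.
fn dl(x: u32, log: &mut Vec<u32>) {
    log.push(x);
}

fn run_lazy() -> Vec<u32> {
    let mut log = Vec::new();
    // map() alone is lazy: nothing consumes the adaptor, so dl never runs.
    let _ = (0..3).map(|x| dl(x, &mut log));
    log
}

fn run_map_count() -> Vec<u32> {
    let mut log = Vec::new();
    // count() drives the iterator, but the count itself is thrown away.
    let _ = (0..3).map(|x| dl(x, &mut log)).count();
    log
}

fn run_for_loop() -> Vec<u32> {
    let mut log = Vec::new();
    // The for loop states the intent directly: run dl once per item.
    for x in 0..3 {
        dl(x, &mut log);
    }
    log
}

fn main() {
    println!("{:?}", run_lazy());
    println!("{:?}", run_map_count());
    println!("{:?}", run_for_loop());
}
```

The map().count() and for loop versions do the same work; the loop simply avoids computing a count nobody reads.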

Lukas Kalbertodt answered Nov 20 '22 20:11