
Passing a List Iterator to multiple Threads in Java

I have a list that contains roughly 200K elements.

Am I able to pass the iterator for this list to multiple threads and have them iterate over the whole lot, without any of them accessing the same elements?

This is what I am thinking of at the moment.

Main:

public static void main(String[] args)
{
    // Imagine this list has the 200,000 elements.
    ArrayList<Integer> list = new ArrayList<Integer>();

    // Get the iterator for the list.
    Iterator<Integer> i = list.iterator();

    // Create MyThread, passing in the iterator for the list.
    MyThread threadOne = new MyThread(i);
    MyThread threadTwo = new MyThread(i);
    MyThread threadThree = new MyThread(i);

    // Start the threads.
    threadOne.start();
    threadTwo.start();
    threadThree.start();
}

MyThread:

public class MyThread extends Thread
{

    Iterator<Integer> i;

    public MyThread(Iterator<Integer> i)
    {
        this.i = i;
    }

    public void run()
    {
        while (this.i.hasNext()) {
            Integer num = this.i.next();
            // Do something with num here.
        }
    }
}

My desired outcome is that each thread processes roughly 66,000 elements, without locking the iterator too much and without any two threads accessing the same element.

Does this sound doable?

Tom Wright asked Feb 05 '16 11:02



3 Answers

Do you really need to manipulate threads and iterators manually? You could use Java 8 Streams and let parallel() do the job.

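By default, a parallel stream runs on the common ForkJoinPool, whose parallelism defaults to one less than the number of available processors; the calling thread also takes part in the work.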

Example:

list.stream()
    .parallel()
    .forEach(this::doSomething);

// For example, print the current integer and the ID of the thread processing it.
public void doSomething(Integer i) {
    System.out.println(String.format("%d, %d", i, Thread.currentThread().getId()));
}

Result (the order varies between runs):

49748, 13
49749, 13
49750, 13
192710, 14
105734, 17
105735, 17
105736, 17
[...]
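To check that a parallel stream hands each element to exactly one thread, a minimal sketch could count visits per element. The class name ParallelStreamDemo, the helper countVisits, and the 0..199999 test data are assumptions made for illustration; they are not part of the question.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicIntegerArray;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class ParallelStreamDemo {

    // Hypothetical helper: visits every element on a parallel stream and
    // records how often each value was seen. Assumes the list holds the
    // values 0..size-1, so each value can double as an array index.
    static AtomicIntegerArray countVisits(List<Integer> list) {
        AtomicIntegerArray visits = new AtomicIntegerArray(list.size());
        list.parallelStream().forEach(visits::incrementAndGet);
        return visits;
    }

    public static void main(String[] args) {
        // Stand-in for the 200,000-element list from the question.
        List<Integer> list = IntStream.range(0, 200_000)
                .boxed()
                .collect(Collectors.toList());

        AtomicIntegerArray visits = countVisits(list);
        boolean exactlyOnce = IntStream.range(0, visits.length())
                .allMatch(i -> visits.get(i) == 1);
        System.out.println("every element visited exactly once: " + exactlyOnce);
    }
}
```

The stream splits the list internally, so no iterator is shared between threads and no explicit locking is needed in the calling code.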

Edit: if you are using Maven, add this configuration to pom.xml to compile for Java 8:

<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-compiler-plugin</artifactId>
      <version>3.3</version>
      <configuration>
        <source>1.8</source>
        <target>1.8</target>
      </configuration>
    </plugin>
  </plugins>
</build>
Arnaud Denoyelle answered Sep 30 '22 20:09



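You can't do it in a thread-safe way with a single unsynchronized iterator. I suggest using sublists:

List<Integer> sub1 = list.subList(0, 100);
List<Integer> sub2 = list.subList(100, 200);

ArrayList#subList() returns a view backed by the original list, so no elements are copied. Each thread can then iterate over its own sublist. This is safe only as long as no thread structurally modifies the list (adds or removes elements) while the others iterate.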
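The sublist idea can be generalized to any number of threads. The following is a rough sketch, not a definitive implementation: the class SubListPartitionDemo, the helpers partition and sumWithThreads, and the use of summing as the "do something" step are all assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class SubListPartitionDemo {

    // Splits the list into `parts` contiguous subList views of near-equal size.
    static <T> List<List<T>> partition(List<T> list, int parts) {
        List<List<T>> chunks = new ArrayList<>();
        int size = list.size();
        for (int p = 0; p < parts; p++) {
            int from = (int) ((long) size * p / parts);
            int to = (int) ((long) size * (p + 1) / parts);
            chunks.add(list.subList(from, to)); // a view, no elements are copied
        }
        return chunks;
    }

    // Sums the list with one thread per chunk; each thread reads only its own view.
    static long sumWithThreads(List<Integer> list, int threadCount) {
        AtomicLong total = new AtomicLong();
        List<Thread> threads = new ArrayList<>();
        for (List<Integer> chunk : partition(list, threadCount)) {
            Thread t = new Thread(() -> {
                long local = 0;
                for (Integer num : chunk) {
                    local += num; // placeholder for the real per-element work
                }
                total.addAndGet(local);
            });
            threads.add(t);
            t.start();
        }
        for (Thread t : threads) {
            try {
                t.join(); // wait for every worker before reading the total
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return total.get();
    }

    public static void main(String[] args) {
        // Stand-in for the 200,000-element list from the question.
        List<Integer> list = IntStream.range(0, 200_000)
                .boxed()
                .collect(Collectors.toList());
        System.out.println(sumWithThreads(list, 3));
    }
}
```

With three threads and 200,000 elements, the chunks hold 66,666, 66,667 and 66,667 elements, which matches the split the question asks for. Each thread gets its own iterator over its own view, so no locking is required during iteration.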

AdamSkywalker answered Sep 30 '22 19:09



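Because next() changes the iterator's internal state, concurrent calls on a shared iterator need synchronization. The hasNext() check and the next() call must happen inside the same synchronized block; otherwise another thread can take the last element between the two calls and next() throws NoSuchElementException:

Integer num = null;
synchronized (i)
{
    if (i.hasNext()) {
        num = i.next();
    }
}
// num is still null here if the iterator was already exhausted.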

That said, if all you need is parallel processing of the list, I recommend the Stream API, as in the answer above.
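For completeness, here is a rough sketch of the shared-iterator approach with the lock held only while an element is fetched. The names SharedIteratorDemo, takeNext and process are hypothetical, and the sketch assumes the list contains no null elements, because null is used to signal that the iterator is exhausted.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.atomic.AtomicIntegerArray;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class SharedIteratorDemo {

    // Returns the next element, or null once the iterator is exhausted.
    // hasNext() and next() run under the same lock, so no other thread can
    // take the last element between the check and the fetch.
    static Integer takeNext(Iterator<Integer> it) {
        synchronized (it) {
            return it.hasNext() ? it.next() : null;
        }
    }

    // Lets `threadCount` threads drain one shared iterator and counts how
    // often each value was seen. Assumes the list holds the values 0..size-1.
    static AtomicIntegerArray process(List<Integer> list, int threadCount) {
        AtomicIntegerArray visits = new AtomicIntegerArray(list.size());
        Iterator<Integer> it = list.iterator();
        List<Thread> threads = new ArrayList<>();
        for (int t = 0; t < threadCount; t++) {
            Thread worker = new Thread(() -> {
                Integer num;
                while ((num = takeNext(it)) != null) {
                    visits.incrementAndGet(num); // per-element work runs outside the lock
                }
            });
            threads.add(worker);
            worker.start();
        }
        for (Thread worker : threads) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return visits;
    }

    public static void main(String[] args) {
        List<Integer> list = IntStream.range(0, 200_000)
                .boxed()
                .collect(Collectors.toList());
        AtomicIntegerArray visits = process(list, 3);
        boolean exactlyOnce = IntStream.range(0, visits.length())
                .allMatch(i -> visits.get(i) == 1);
        System.out.println("every element visited exactly once: " + exactlyOnce);
    }
}
```

Every element still costs one lock acquisition, and the threads do not necessarily receive equal shares, so this tends to pay off only when the per-element work is much more expensive than fetching the element.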

oak answered Sep 30 '22 21:09
