How can I get a parallel stream of Files.walk?

I need to do some read-only processing on all files in a folder recursively. I'm using Files.walk to get a stream of the files, but I noticed that the API specifies that walk only returns a regular stream, not a parallel stream.

How can I process all the files in a directory in parallel?

David says Reinstate Monica asked Nov 08 '15 17:11


2 Answers

I had the same issue. The Files.walk stream does not seem to run in parallel. Even after transforming the stream into a parallel stream by invoking parallel(), the processing was performed in one thread only.

The only solution that worked for me was to collect the Paths into a list and create a parallel stream from that list, as mentioned by Tagir Valeev.

Solution that did not work:

Files.walk(Paths.get(System.getProperty("user.dir")))
        .parallel()
        .filter(Files::isRegularFile)
        ...
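
To check whether parallel() has any effect on a given machine and directory tree, one option is to record which threads actually handle the paths. The sketch below does that; the class name WalkThreads and the helper threadsUsed are made up for illustration. A commonly cited explanation for the single-thread behaviour is that the walk stream is built from an iterator of unknown size, which only splits off work in large batches, so the result will probably depend on how many entries the tree contains.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Stream;

class WalkThreads {

    // Hypothetical helper: returns the names of all threads that
    // processed at least one regular file under root.
    static Set<String> threadsUsed(Path root) throws IOException {
        // Thread-safe set, because forEach may run on several threads.
        Set<String> names = ConcurrentHashMap.newKeySet();
        // Files.walk keeps directories open, so close the stream when done.
        try (Stream<Path> paths = Files.walk(root).parallel()) {
            paths.filter(Files::isRegularFile)
                 .forEach(p -> names.add(Thread.currentThread().getName()));
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        Path root = Paths.get(System.getProperty("user.dir"));
        System.out.println("Threads used: " + threadsUsed(root));
    }
}
```

If only one thread name is printed for a directory with many files, the parallel() call is not spreading the work.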

Working solution:

Files.walk(Paths.get(System.getProperty("user.dir")))
        .collect(Collectors.toList())
        .parallelStream()
        .filter(Files::isRegularFile)
        ...
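
For reference, here is a self-contained sketch of the collect-then-parallelize approach. The class ParallelWalk, its two helpers, and the size-summing task are assumptions chosen only to have some read-only work to run; replace them with the real processing. It also closes the walk stream with try-with-resources, which the snippet above leaves out.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class ParallelWalk {

    // Hypothetical helper: walks root sequentially and collects every regular file.
    static List<Path> listRegularFiles(Path root) throws IOException {
        // Files.walk keeps directories open, so close the stream when done.
        try (Stream<Path> paths = Files.walk(root)) {
            return paths.filter(Files::isRegularFile)
                        .collect(Collectors.toList());
        }
    }

    // Hypothetical read-only task: sums the file sizes using a parallel stream.
    // A list has a known size, so it can be split evenly across threads.
    static long totalSize(List<Path> files) {
        return files.parallelStream()
                .mapToLong(p -> {
                    try {
                        return Files.size(p);
                    } catch (IOException e) {
                        // Lambdas cannot throw checked exceptions; wrap it.
                        throw new UncheckedIOException(e);
                    }
                })
                .sum();
    }

    public static void main(String[] args) throws IOException {
        Path root = Paths.get(System.getProperty("user.dir"));
        List<Path> files = listRegularFiles(root);
        System.out.println(files.size() + " files, " + totalSize(files) + " bytes");
    }
}
```

The trade-off is that the whole tree is walked and held in memory before any processing starts, which is usually acceptable when the per-file work dominates.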

Oliver answered Oct 23 '22 11:10


You can transform any Stream into a parallel Stream by invoking Stream::parallel.

Files.walk(startPath).parallel().forEach(...);
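
A minimal sketch of what Stream::parallel does here, using a made-up class ParallelFlag: the call marks the pipeline as parallel, which isParallel() reports. Note that the flag only says parallel execution is allowed; how well the work is actually split still depends on the stream's source.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

class ParallelFlag {

    // Hypothetical helper: reports whether the walk stream is flagged parallel.
    static boolean walkIsParallel(Path start) throws IOException {
        try (Stream<Path> paths = Files.walk(start).parallel()) {
            return paths.isParallel();
        }
    }

    // Hypothetical helper: counts all entries, including the start path itself.
    static long countEntries(Path start) throws IOException {
        try (Stream<Path> paths = Files.walk(start).parallel()) {
            return paths.count();
        }
    }

    public static void main(String[] args) throws IOException {
        Path start = Paths.get(System.getProperty("user.dir"));
        System.out.println("parallel flag: " + walkIsParallel(start));
        System.out.println("entries: " + countEntries(start));
    }
}
```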

Flown answered Oct 23 '22 11:10