
Parallel version of Files.walkFileTree (Java or Scala)

Does anyone know of a parallel equivalent of Java's Files.walkFileTree, or something similar? It can be a Java or a Scala library.

matt asked Jul 18 '13 19:07


1 Answer

As others have pointed out, walking a file tree is almost certainly I/O-bound rather than CPU-bound, so the benefit of a multithreaded file tree walk is questionable. But if you really want one, you can roll your own with a ForkJoinPool or something similar.

import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

public class MultiThreadedFileTreeWalk {
    private static class RecursiveWalk extends RecursiveAction {
        private static final long serialVersionUID = 6913234076030245489L;
        private final Path dir;

        public RecursiveWalk(Path dir) {
            this.dir = dir;
        }

        @Override
        protected void compute() {
            final List<RecursiveWalk> walks = new ArrayList<>();
            try {
                Files.walkFileTree(dir, new SimpleFileVisitor<Path>() {
                    @Override
                    public FileVisitResult preVisitDirectory(Path dir, BasicFileAttributes attrs) throws IOException {
                        if (!dir.equals(RecursiveWalk.this.dir)) {
                            // Subdirectory: hand it to a new forked task and skip it in this walk.
                            RecursiveWalk w = new RecursiveWalk(dir);
                            w.fork();
                            walks.add(w);

                            return FileVisitResult.SKIP_SUBTREE;
                        } else {
                            // The root directory of this task: keep visiting its direct entries.
                            return FileVisitResult.CONTINUE;
                        }
                    }

                    @Override
                    public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) throws IOException {
                        System.out.println(file + "\t" + Thread.currentThread());
                        return FileVisitResult.CONTINUE;
                    }
                });
            } catch (IOException e) {
                e.printStackTrace();
            }

            // Wait for every forked subdirectory walk to finish before this task completes.
            for (RecursiveWalk w : walks) {
                w.join();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        RecursiveWalk w = new RecursiveWalk(Paths.get(".").toRealPath());
        ForkJoinPool p = new ForkJoinPool();
        p.invoke(w);
    }
}

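This example walks each directory as a separate fork/join task, so different directories can be processed on different worker threads of the pool rather than each getting a thread of its own. The Java tutorial on the Java 7 fork/join framework covers the details.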
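If you need a result back instead of printed output, the same idea works with a RecursiveTask. The following is only a sketch under a few assumptions: the class name ParallelFileCount, the nested CountTask, and the countFiles helper are made up for illustration, it lists each directory with Files.newDirectoryStream rather than Files.walkFileTree, and it counts every non-directory entry (symbolic links are not followed). Unreadable directories are reported and skipped.

```java
import java.io.IOException;
import java.nio.file.DirectoryIteratorException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

public class ParallelFileCount {
    // Hypothetical task: counts the non-directory entries below one directory.
    private static class CountTask extends RecursiveTask<Long> {
        private static final long serialVersionUID = 1L;
        private final Path dir;

        CountTask(Path dir) {
            this.dir = dir;
        }

        @Override
        protected Long compute() {
            long count = 0;
            List<CountTask> subtasks = new ArrayList<>();
            try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
                for (Path entry : entries) {
                    if (Files.isDirectory(entry, LinkOption.NOFOLLOW_LINKS)) {
                        // Fork one task per subdirectory, as in the answer above.
                        CountTask task = new CountTask(entry);
                        task.fork();
                        subtasks.add(task);
                    } else {
                        count++;
                    }
                }
            } catch (IOException | DirectoryIteratorException e) {
                // Report the problem and keep whatever was counted so far.
                System.err.println("Skipping " + dir + ": " + e);
            }
            // Add up the counts of all forked subdirectory tasks.
            for (CountTask task : subtasks) {
                count += task.join();
            }
            return count;
        }
    }

    // Hypothetical helper: runs the count on its own pool and shuts the pool down afterwards.
    public static long countFiles(Path root) {
        ForkJoinPool pool = new ForkJoinPool();
        try {
            return pool.invoke(new CountTask(root));
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws IOException {
        Path root = Paths.get(args.length > 0 ? args[0] : ".").toRealPath();
        System.out.println(countFiles(root) + " files under " + root);
    }
}
```

Each task only returns its own subtotal, so no shared counter or locking is needed. Whether this beats a plain single-threaded walk still depends on the disk and file system, for the I/O reasons mentioned at the top.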

Jeffrey answered Oct 21 '22 20:10