
Iterate large set of files in a directory

Tags: java, file, nio

I have a directory with 100,000 files and need to iterate over all of them to read a value from each. Right now I use listFiles() to load all the files into an array and then process them one by one. Is there a memory-efficient way to do this without loading everything into an array?

File[] tFiles = new File(Dir).listFiles();

if (tFiles != null) { // listFiles() returns null if Dir is not a readable directory
    for (final File tFile : tFiles) {
        // Process files one by one
    }
}
meteor asked Nov 06 '15 15:11


3 Answers

Since Java 7, you can use the file visitor pattern to visit the contents of a directory recursively.

The FileVisitor interface is documented in the Javadoc for the java.nio.file package.

This allows you to iterate over files without creating a large array of File objects.

Simple example to print out your file names:

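import java.io.IOException;
import java.net.URI;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

// URI.create avoids the checked URISyntaxException thrown by new URI(...)
Path start = Paths.get(URI.create("file:///my/folder/"));

Files.walkFileTree(start, new SimpleFileVisitor<Path>() {
    @Override
    public FileVisitResult visitFile(Path file, BasicFileAttributes attrs)
            throws IOException {
        System.out.println(file); // called once for each file
        return FileVisitResult.CONTINUE;
    }

    @Override
    public FileVisitResult postVisitDirectory(Path dir, IOException e)
            throws IOException {
        if (e == null) {
            System.out.println(dir); // called after all entries in dir were visited
            return FileVisitResult.CONTINUE;
        } else {
            // directory iteration failed
            throw e;
        }
    }
});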
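Since the question concerns a single flat directory, the walk can also be limited to one level. The following is a minimal sketch, assuming you only want the regular files directly inside the directory; the class name FlatDirectoryWalk and the method countRegularFiles are made up for illustration, and the counter stands in for whatever per-file processing you need.

```java
import java.io.IOException;
import java.nio.file.FileVisitOption;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.EnumSet;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper class, not part of the JDK.
class FlatDirectoryWalk {

    // Counts the regular files directly inside dir without descending into subdirectories.
    static long countRegularFiles(Path dir) throws IOException {
        AtomicLong count = new AtomicLong();
        // maxDepth = 1: only the direct children of dir are visited.
        Files.walkFileTree(dir, EnumSet.noneOf(FileVisitOption.class), 1,
                new SimpleFileVisitor<Path>() {
                    @Override
                    public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                        // At the depth limit, subdirectories are reported here too, so filter them out.
                        if (attrs.isRegularFile()) {
                            count.incrementAndGet(); // replace with real per-file processing
                        }
                        return FileVisitResult.CONTINUE;
                    }
                });
        return count.get();
    }

    public static void main(String[] args) throws IOException {
        // Uses the first argument as the directory, or the current directory if none is given.
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        System.out.println(countRegularFiles(dir) + " regular files in " + dir);
    }
}
```

The visitor receives one entry at a time, so memory use should not grow with the number of files in the directory.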
Mena answered Nov 19 '22 20:11


Java 8 lazily loaded stream version:

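// Files.list keeps the directory open, so close the stream with try-with-resources
try (Stream<Path> paths = Files.list(Paths.get("path to directory"))) {
    paths.forEach(path -> {
        File file = path.toFile();
        // process your file
    });
}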
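To connect this to the original goal of reading a value from every file, here is a self-contained sketch. It assumes the value is the first line of each file and that the files are UTF-8 text; the class FirstLineScanner and its methods are hypothetical names chosen for this example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.function.BiConsumer;
import java.util.stream.Stream;

// Hypothetical helper class, not part of the JDK.
class FirstLineScanner {

    // Reads only the first line of a file; returns null for an empty file.
    static String firstLine(Path file) {
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            return reader.readLine();
        } catch (IOException e) {
            // Lambdas in a stream pipeline cannot throw checked exceptions, so wrap it.
            throw new UncheckedIOException(e);
        }
    }

    // Passes each regular file in dir, together with its first line, to the callback.
    static void scan(Path dir, BiConsumer<Path, String> action) throws IOException {
        // The stream holds an open directory handle, so it is closed via try-with-resources.
        try (Stream<Path> paths = Files.list(dir)) {
            paths.filter(Files::isRegularFile)
                 .forEach(p -> action.accept(p, firstLine(p)));
        }
    }

    public static void main(String[] args) throws IOException {
        // Uses the first argument as the directory, or the current directory if none is given.
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        scan(dir, (path, line) -> System.out.println(path.getFileName() + ": " + line));
    }
}
```

Passing a callback instead of collecting the results into a list keeps the per-file values from piling up in memory, which matters with 100,000 files.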
Andrei Amarfii answered Nov 19 '22 20:11


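If you want to avoid the boilerplate that comes with the JDK's FileVisitor, you can use Guava. Files.fileTreeTraverser() (from com.google.common.io, not java.nio.file) gives you a TreeTraverser<File> that you can use to traverse the files in a folder, including its sub-folders. Note that newer Guava releases deprecate this method in favor of Files.fileTraverser().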

for (File f : Files.fileTreeTraverser()
                    .preOrderTraversal(new File("/parent/folder"))) {
    // do something with each file
}
Mick Mnemonic answered Nov 19 '22 21:11