How can I check if files exist for a given list of paths?

I have a list of 50,000 paths and I need to check whether a file exists at each of them. Right now, I am verifying each path independently like this:

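    import java.io.File;
    import java.io.FileNotFoundException;
    import java.util.ArrayList;
    import java.util.List;

    // Returns the paths (relative to baseDirectory) that are NOT existing regular files.
    public static List<String> filesExist(String baseDirectory, Iterable<String> paths) throws FileNotFoundException {
        File directory = new File(baseDirectory);
        if (!directory.exists()) {
            throw new FileNotFoundException("No Directory found: " + baseDirectory);
        } else if (!directory.isDirectory()) {
            throw new FileNotFoundException(baseDirectory + " is not a directory!");
        }

        List<String> filesNotFound = new ArrayList<String>();

        for (String path : paths) {
            // Resolve against the directory so a missing trailing separator does not matter.
            if (!new File(directory, path).isFile()) {
                filesNotFound.add(path);
            }
        }
        return filesNotFound;
    }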

Is there a way to improve it so that I don't create 50,000 File objects? I am also using Guava. Is there any utility in there that can help me with a bulk exists() method?

brainydexter asked Jun 15 '12 07:06




1 Answer

The creation of 50,000 File objects is almost certainly not the bottleneck. The actual filesystem operations are probably what's making it slow.

I have two suggestions:

  1. Before checking, sort paths by their location to make best use of filesystem caches.
  2. If a sub-directory does not exist, you can automatically assume that all files and sub-directories therein don't exist either.
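
As a rough illustration of both suggestions, here is a minimal sketch using java.nio.file. It is not a tested or benchmarked solution: the class name BulkFileCheck and the method findMissing are hypothetical, and it assumes every entry is a path relative to the base directory with no leading separator. Sorting the strings groups files from the same directory together, and a small cache of directories already known to be missing lets the loop skip the filesystem call for anything beneath them.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of the JDK or Guava.
final class BulkFileCheck {

    /** Returns, in sorted order, the relative paths under baseDir that are not regular files. */
    static List<String> findMissing(Path baseDir, Collection<String> relativePaths) {
        // Suggestion 1: sort so that paths in the same directory are checked back to back.
        List<String> sorted = new ArrayList<>(relativePaths);
        Collections.sort(sorted);

        Set<Path> existingDirs = new HashSet<>();
        Set<Path> missingDirs = new HashSet<>();
        List<String> notFound = new ArrayList<>();

        for (String relative : sorted) {
            Path file = baseDir.resolve(relative);
            Path parent = file.getParent();
            // Suggestion 2: if the parent directory is known to be absent,
            // the file cannot exist, so the per-file check is skipped.
            if (parent == null
                    || !directoryExists(parent, existingDirs, missingDirs)
                    || !Files.isRegularFile(file)) {
                notFound.add(relative);
            }
        }
        return notFound;
    }

    private static boolean directoryExists(Path dir, Set<Path> existing, Set<Path> missing) {
        if (existing.contains(dir)) {
            return true;
        }
        // A directory below a known-missing directory is missing too (no filesystem call needed).
        for (Path p = dir; p != null; p = p.getParent()) {
            if (missing.contains(p)) {
                return false;
            }
        }
        boolean exists = Files.isDirectory(dir);
        (exists ? existing : missing).add(dir);
        return exists;
    }
}
```

Note that the result comes back in sorted order rather than input order. Whether this actually helps depends on the filesystem and on how many paths share missing directories, so it is worth measuring against the original loop before adopting it.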
NPE answered Sep 20 '22 19:09
