When recursively traversing a directory structure, what is the most efficient algorithm to use if you have more files than directories? I notice that depth-first traversal seems to take longer when there are a lot of files in a given directory. Does breadth-first traversal work more efficiently in this case? I currently have no way to profile the two algorithms, so your insights are very much welcome.
EDIT: In response to alphazero's comment, I'm using PHP on a Linux machine.
Answer: A tree structure is the most common directory structure. The tree has a root directory, and every file in the system has a unique path. A directory within a tree-structured directory may contain files or subdirectories, and special system calls are used to create or remove directories. For keeping track of directories and files on disk, a B+ tree works well.
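In PHP (which the asker mentions using), those directory calls are exposed as built-in functions. A minimal sketch, where /tmp/example is a purely illustrative path and not anything from the question:

```php
<?php
// Directory creation/removal in PHP wraps the underlying system calls.
// /tmp/example is a placeholder path for illustration only.
mkdir('/tmp/example/sub', 0755, true); // third argument creates parent directories recursively
rmdir('/tmp/example/sub');             // rmdir only removes empty directories
rmdir('/tmp/example');
```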
Since you have more files than directories, it does not appear as if you are dealing with very deeply nested directories that would make DFS take more memory (and hence somewhat more time) than BFS; DFS memory use grows with the depth of the tree, while BFS memory use grows with the width of the widest level. Essentially, BFS and DFS both do the same thing (i.e. visit every node of the graph exactly once), so in general their speeds should not differ by any significant amount.
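For concreteness, a recursive DFS over a directory tree in PHP might look like the sketch below; the function name visitDfs and the placeholder file-processing comment are illustrative, not the asker's actual code:

```php
<?php
// A minimal recursive depth-first traversal (illustrative sketch).
function visitDfs(string $dir): void
{
    $entries = scandir($dir);
    if ($entries === false) {
        return; // unreadable directory
    }
    foreach ($entries as $entry) {
        if ($entry === '.' || $entry === '..') {
            continue; // skip self and parent references
        }
        $path = $dir . DIRECTORY_SEPARATOR . $entry;
        if (is_dir($path)) {
            visitDfs($path); // descend before finishing this directory
        } else {
            // process the file at $path here
        }
    }
}
```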
It is difficult to say why exactly your DFS is slower without actually seeing your implementation. Are you sure you are not visiting the same nodes more than once because of symbolic links or shortcuts in your filesystem? You will also probably get a significant speedup if you use an explicit stack-based DFS rather than recursion, as sketched below.
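Here is one way that suggestion could look in PHP: an explicit-stack DFS that also skips symbolic links so no node is visited twice. This is a sketch under those assumptions (the name visitIterative is hypothetical), not a drop-in replacement for the asker's code; note that swapping array_pop() for array_shift() would turn the same loop into BFS.

```php
<?php
// Explicit-stack DFS: no recursion, and symlinks are skipped to avoid cycles.
function visitIterative(string $root): void
{
    $stack = [$root]; // directories still to be scanned
    while ($stack !== []) {
        $dir = array_pop($stack); // LIFO pop = depth-first order
        $entries = scandir($dir);
        if ($entries === false) {
            continue; // unreadable directory
        }
        foreach ($entries as $entry) {
            if ($entry === '.' || $entry === '..') {
                continue; // skip self and parent references
            }
            $path = $dir . DIRECTORY_SEPARATOR . $entry;
            if (is_link($path)) {
                continue; // don't follow symlinks: avoids revisits and loops
            }
            if (is_dir($path)) {
                $stack[] = $path; // defer subdirectory for a later pass
            } else {
                // process the file at $path here
            }
        }
    }
}
```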