
Java multithreaded file system tree traversal

I have a little task for my interview (sic!).
I need to create a Java CLI program that searches for files whose names match a given pattern. They said I need to use a multi-threaded approach without using the java.util.concurrent package, and that it should perform well on parallel controllers.

From my point of view it's pretty simple: I can create a separate thread for each subfolder to run over its contents, and for each nested subfolder start another thread...
But it can't be that easy :) Maybe someone can point out the typical pitfalls that I should be aware of. Any advice on the correct way to implement this in Java would be highly appreciated.

Thanks!

UPD1
File name should match pattern.

Igor Konoplyanko asked Dec 09 '25 14:12


1 Answer

The problem is finding a good number of threads and distributing the work as evenly as possible.

Assuming you don't know the number of files and subdirectories in each folder, that could become quite tricky.

Here's an idea for a start:

What you might do is create a number of threads that operate on a central folder list, spawning a thread per folder you encounter up to a certain maximum. Each thread would put the subfolders of the directory it is working on onto the central list, and when it is done with that directory it would pick the next one from the list.

If a folder is put on the list and the maximum number of threads is not reached, a new thread is spawned immediately.

If a thread has run out of work and the folder list is empty, it could either stop (requiring you to spawn a new one if needed) or wait until either a folder appears on the list or the application signals that all folders have been processed.

Finally, don't forget to synchronize on the folder list.
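
A rough sketch of the idea above might look like the following. It is only an illustration under a few assumptions: the class name `ParallelFileSearch` and its methods are made up for this example, the pattern is treated as a regular expression applied to the file name, and for simplicity it starts a fixed number of worker threads up front instead of spawning them lazily. It uses `synchronized`, `wait()` and `notifyAll()` only, so nothing from java.util.concurrent is needed.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch: a fixed set of workers sharing one central folder list.
class ParallelFileSearch {
    private final LinkedList<File> folders = new LinkedList<File>(); // guarded by "this"
    private final List<File> matches = new ArrayList<File>();        // guarded by "this"
    private final Pattern pattern;
    private int busyWorkers = 0; // workers currently scanning a folder

    ParallelFileSearch(Pattern pattern) {
        this.pattern = pattern;
    }

    // Puts a folder on the central list and wakes up waiting workers.
    private synchronized void addFolder(File dir) {
        folders.add(dir);
        notifyAll();
    }

    // Blocks until a folder is available; returns null once all work is done.
    private synchronized File nextFolder() throws InterruptedException {
        while (folders.isEmpty()) {
            if (busyWorkers == 0) {
                return null; // list is empty and nobody can add to it anymore
            }
            wait();
        }
        busyWorkers++;
        return folders.removeFirst();
    }

    // Called after a folder has been fully scanned (its subfolders are already listed).
    private synchronized void folderDone() {
        busyWorkers--;
        notifyAll(); // lets idle workers re-check the termination condition
    }

    private synchronized void addMatch(File file) {
        matches.add(file);
    }

    private void scan(File dir) {
        File[] entries = dir.listFiles(); // null if unreadable or not a directory
        if (entries == null) {
            return;
        }
        for (File entry : entries) {
            if (entry.isDirectory()) {
                addFolder(entry);
            } else if (pattern.matcher(entry.getName()).matches()) {
                addMatch(entry);
            }
        }
    }

    List<File> search(File root, int threadCount) throws InterruptedException {
        addFolder(root);
        Thread[] workers = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            workers[i] = new Thread(new Runnable() {
                public void run() {
                    try {
                        File dir;
                        while ((dir = nextFolder()) != null) {
                            try {
                                scan(dir);
                            } finally {
                                folderDone();
                            }
                        }
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
        synchronized (this) {
            return new ArrayList<File>(matches);
        }
    }
}
```

Something like `new ParallelFileSearch(Pattern.compile(".*\\.txt")).search(new File("."), 4)` would then return the matching files. A worker only stops when the list is empty and no other worker is still scanning, because a busy worker may still add subfolders. One pitfall the sketch does not handle: symbolic links that point back up the tree could make the traversal loop forever.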

Hope that helps you to get started.


Edit: (don't take the following too seriously :) )

You could also use another thread pool implementation that doesn't use the java.util.concurrent package :)

Edit 2: Basically, what I described above is a simple, task-specific thread pool implementation. You might look for more information on building a thread pool yourself (in the context of your assignment, a thread pool task would be scanning one folder).

Thomas answered Dec 12 '25 04:12


