I have an app that works great for processing files that land in a directory on my server. The process is:
1) check for files in a directory
2) queue a user work item to handle each file in the background
3) wait until all workers have completed
4) goto 1
This works nicely and I never worry about the same file being processed twice or multiple threads being spawned for the same file. However, if there's one file that takes too long to process, step #3 hangs on that one file and holds up all other processing.
So my question is, what's the correct paradigm to spawn exactly one thread for each file I need to process, while not blocking if one file takes too long? I considered FileSystemWatcher, but the files may not be immediately readable which is why I continually look at all files and spawn a process for each (which will immediately exit if the file is locked).
Should I remove step #3 and maintain a list of files I've already processed? That seems messy and the list would grow very large over time so I suspect there's a more elegant solution.
I would suggest that you maintain a list of files which you are currently processing. Have the thread remove itself from this list when the thread finishes. When looking for new files, exclude those in the currently-running list.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With