 

How to process files concurrently with bash?

Suppose I have 10K files and a bash script which processes a single file. Now I would like to process all these files concurrently, with only K instances of the script running in parallel. I do not want (obviously) to process any file more than once.

How would you suggest implementing this in bash?

Michael asked Dec 02 '22

1 Answer

One way of executing a limited number of parallel jobs is with GNU parallel. For example, with this command:

find . -type f -print0 | parallel -0 -P 3 ./myscript {}

This passes every file in the current directory (and its subdirectories) to myscript as an argument, one file per job. The -0 option sets the delimiter to the null character (matching find's -print0, so filenames with spaces or newlines are handled safely), and the -P option sets the number of jobs that run in parallel. By default, the number of parallel jobs is equal to the number of CPU cores on the system. There are further options for running jobs on remote machines, clusters, etc., which are covered in the GNU parallel documentation.
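
For concreteness, here is a minimal sketch of how the pieces could fit together. The script name myscript and the value K=4 are just placeholders for your own per-file script and the desired level of parallelism:

#!/bin/bash
# myscript (hypothetical): processes the single file passed as $1
set -euo pipefail
file="$1"
echo "processing $file"
# ... actual per-file work goes here ...

Then, assuming myscript is executable, run at most K jobs at a time over all regular files:

K=4
find . -type f -print0 | parallel -0 -P "$K" ./myscript {}

Because parallel hands each filename to exactly one job, no file is processed more than once.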

user000001 answered Dec 20 '22