I have the results of a numerical simulation that consist of hundreds of directories; each directory contains millions of text files.
I need to replace the string "wavelength;" with "wavelength_bc;", so I have tried both of the following:
find . -type f -exec sed -i 's/wavelength;/wavelength_bc;/g' {} \;
and
find . -type f -exec sed -i 's/wavelength;/wavelength_bc;/g' {} +
Unfortunately, the commands above take a very long time to finish (more than 1 hour).
How can I take advantage of the 8 cores on my machine to speed up the command above?
I am thinking of using xargs with the -P flag, but I'm worried that it might corrupt the files, and I don't know whether it is safe.
In summary:
How can I speed up sed substitutions when used with find?
Is it safe to use xargs -P to run that in parallel?
Thank you
xargs -P should be safe to use, since each file is handed to exactly one sed process. However, you will need the -print0 option of find, piped to xargs -0, to handle filenames that contain spaces or wildcard characters:
find . -type f -print0 |
xargs -0 -I {} -P 0 sed -i 's/wavelength;/wavelength_bc;/g' {}
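The -P 0 option tells xargs to run as many processes as possible at the same time. That number is not tied to the number of CPU cores, and because -I {} starts one sed process per file, -P 8 may be a more predictable choice on an 8-core machine.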
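Because -I {} launches a separate sed process for every file, much of the run time with millions of files may go into starting processes. A possible variation, sketched below, hands files to sed in batches with -n and caps the number of parallel jobs. The function names, the batch size of 1000 and the default of 8 jobs are assumptions for illustration, not part of the original answer. The second function additionally assumes GNU grep and only passes files that actually contain the string, since sed -i rewrites every file it is given, changed or not. Both rely on GNU xargs (-r, -P) and GNU sed (-i).

```shell
#!/usr/bin/env bash

# Hypothetical helper: run sed over all regular files under a directory,
# in parallel, with many files per sed invocation.
replace_parallel() {
    local dir="${1:-.}"       # directory to process
    local jobs="${2:-8}"      # parallel sed processes (8 cores in the question)
    local batch="${3:-1000}"  # files per sed invocation (arbitrary choice)

    # -print0 / -0 keep filenames with spaces or wildcards intact.
    # -r skips running sed when find produces no files.
    find "$dir" -type f -print0 |
        xargs -0 -r -P "$jobs" -n "$batch" \
            sed -i 's/wavelength;/wavelength_bc;/g'
}

# Hypothetical helper: same idea, but only files that contain the
# string are passed to sed, so unaffected files are never rewritten.
replace_matching_parallel() {
    local dir="${1:-.}"
    local jobs="${2:-8}"
    local batch="${3:-1000}"

    # -r recurse, -l list matching files, -Z end each name with NUL,
    # -F treat the pattern as a fixed string.
    grep -rlZF 'wavelength;' "$dir" |
        xargs -0 -r -P "$jobs" -n "$batch" \
            sed -i 's/wavelength;/wavelength_bc;/g'
}
```

For example, replace_parallel . 8 1000 would process the current directory with up to 8 sed processes, each receiving up to 1000 files. Trying it on a copy of one directory first is a cheap way to check the result and to compare timings.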