I run a command like this on my MacBook, using GNU Parallel:
parallel "sample operation" ::: samplefolder/*.txt
The problem is that I have 20,000 .txt files in samplefolder, which causes an "Argument list too long" error.
There is no such problem when I run the same script on an Ubuntu machine.
I tried googling and reading some man pages, but no luck. How can I solve this problem?
Thanks!
Try:
ls samplefolder | grep '\.txt$' | parallel "sample operation samplefolder/{}"
Here's how you can deal with this on a typical UNIX box (I assume OS X has find and xargs too):
find samplefolder -name \*.txt -print0 | xargs -P 8 -n 1 -0 sample operation
find prints all .txt file names in samplefolder, separated by a NUL character. xargs in turn reads this NUL-separated list (-0) and, for each N files (-n 1, i.e. for each single file in this case), launches sample operation path/file.txt, with up to 8 (-P 8) of them running in parallel.
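As a minimal sketch of why the NUL-separated form is worth the extra flags, the script below builds a scratch folder containing a file name with a space in it and runs the same find | xargs pipeline over it. The scratch folder, the file names, and the use of echo as a stand-in for "sample operation" are all assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical demo folder; the file names are made up for illustration.
demo_dir=$(mktemp -d)
touch "$demo_dir/a.txt" "$demo_dir/b c.txt" "$demo_dir/notes.md"

# echo stands in for "sample operation".
# -print0 / -0 keep "b c.txt" as a single argument despite the space;
# -n 1 runs one invocation per file, -P 4 allows up to 4 at a time.
processed=$(find "$demo_dir" -name '*.txt' -print0 \
  | xargs -0 -n 1 -P 4 echo processing \
  | wc -l | tr -d ' ')

echo "invocations: $processed"
rm -rf "$demo_dir"
```

Only the two .txt files are passed along, one per invocation, and the file list never appears on a single command line, so the argument-length limit does not come into play.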
Handle that operation in smaller batches using -N, and pipe the input file list rather than giving it on the command line.
For example, expanding on ArtemB's answer, to process in batches of 16 files (warning, this will break with paths containing newlines):
find samplefolder -type f -name "*.txt" | parallel -N16 "sample operation" {}
To tailor the maximum number of arguments you can check getconf ARG_MAX in your environment. For example:
$ getconf ARG_MAX
2097152
Given that paths on *nix can typically be up to 4096 characters, that leaves room for at least 2097152/4096 = 512 file paths on the command line (not counting the "sample operation" command itself, of course).
So something like
find samplefolder -name "*.txt" | parallel -N500 "sample operation" {}
would let me process in batches of 500. Of course, depending on what tool you are running, you may want to experiment and optimize the batch size for speed.
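The batch-size arithmetic above can be sketched as a small script that reads the limit from the current machine instead of hard-coding it. The 4096-byte worst-case path length is the assumption taken from the text, and the variable names are made up for illustration.

```shell
#!/bin/sh
# Assumption: 4096 bytes is the worst-case length of a single path.
max_path_len=4096

# Ask the system for its limit on the total size of the argument list.
arg_max=$(getconf ARG_MAX)

# Conservative number of worst-case paths that fit on one command line.
batch_size=$((arg_max / max_path_len))

echo "ARG_MAX=$arg_max batch_size=$batch_size"

# Possible use (not run here):
#   find samplefolder -name '*.txt' | parallel -N"$batch_size" "sample operation" {}
```

Treat the result as an upper bound rather than a target: the environment variables count against the same limit, so a somewhat smaller batch size leaves a safety margin.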