
gsutil: Argument list too long

I am trying to upload many thousands of files to Google Cloud Storage, with the following command:

gsutil -m cp *.json gs://mybucket/mydir

But I get this error:

-bash: Argument list too long

What is the best way to handle this? I can obviously write a bash script to iterate over different numbers:

gsutil -m cp 92*.json gs://mybucket/mydir
gsutil -m cp 93*.json gs://mybucket/mydir
gsutil -m cp ...*.json gs://mybucket/mydir

But the problem is that I don't know in advance what my filenames are going to be, so writing that command isn't trivial.

Is there either a way to handle this with gsutil natively (I don't think so, from the documentation), or a way to handle this in bash where I can list say 10,000 files at a time, then pipe them to the gsutil command?

asked Nov 28 '22 by Richard

2 Answers

Eric's answer should work, but another option would be to rely on gsutil's built-in wildcarding, by quoting the wildcard expression:

gsutil -m cp "*.json" gs://mybucket/mydir

To explain more: the "Argument list too long" error comes from the shell (strictly, from the kernel's exec call), which enforces a limit, ARG_MAX, on the total size of a command's argument list plus environment. By quoting the wildcard you prevent the shell from expanding it, so the literal string *.json is passed to gsutil instead. gsutil then expands the wildcard in a streaming fashion, i.e., it expands while performing the operations, so it never needs to buffer an unbounded amount of expanded text. As a result you can use gsutil wildcards over arbitrarily large expressions. The same is true for gsutil wildcards over object names, so for example this would work:

gsutil -m cp "gs://my-bucket1/*" gs://my-bucket2

even if there are a billion objects at the top-level of gs://my-bucket1.
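If you want to see the limit on your own system (independent of gsutil), you can query the kernel's ARG_MAX value; the shell-expanded *.json list, together with the environment, has to fit within it:

```shell
# Print the kernel's limit (in bytes) on the combined size of a
# command's argument list and environment. If the shell-expanded
# "*.json" list exceeds this, exec fails with "Argument list too long".
getconf ARG_MAX
```

The exact value varies by OS and kernel configuration, which is why the same command can work on one machine and fail on another.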

answered Dec 19 '22 by Mike Schwartz

If your filenames are safe from newlines, you could use gsutil cp's ability to read filenames from stdin, like this:

find . -maxdepth 1 -type f -name '*.json' | gsutil -m cp -I gs://mybucket/mydir

Or, if you're not sure your names are safe and your find and xargs support it, you could do:

find . -maxdepth 1 -type f -name '*.json' -print0 | xargs -0 -I {} gsutil -m cp {} gs://mybucket/mydir
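One caveat with the xargs form above: -I {} runs one gsutil invocation per file, which is slow for thousands of files. If your xargs supports -n, a sketch of a batched variant (bucket and directory are the asker's; the batch size of 1000 is an arbitrary choice) wraps the copy in sh -c so the destination argument stays last:

```shell
# Pass up to 1000 null-delimited filenames per gsutil invocation.
# Inside sh -c, "$@" expands to the current batch of filenames,
# so the destination can still come last on the command line.
find . -maxdepth 1 -type f -name '*.json' -print0 \
  | xargs -0 -n 1000 sh -c 'gsutil -m cp "$@" gs://mybucket/mydir' sh
```

The trailing sh fills $0 inside the inner shell, so all the piped filenames land in "$@".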
answered Dec 18 '22 by Eric Renouf