
Mass rename objects on Google Cloud Storage

Is it possible to mass rename objects on Google Cloud Storage using gsutil (or some other tool)? I am trying to figure out a way to rename a bunch of images from *.JPG to *.jpg.

joshhunt asked Nov 27 '14 08:11



2 Answers

Here is a way to do this natively in bash, followed by a line-by-line explanation of the code:

gsutil ls gs://bucket_name/*.JPG > src-rename-list.txt
sed 's/\.JPG$/.jpg/' src-rename-list.txt > dest-rename-list.txt
paste -d ' ' src-rename-list.txt dest-rename-list.txt | sed -e 's/^/gsutil mv /' | while read -r line; do bash -c "$line"; done
rm src-rename-list.txt; rm dest-rename-list.txt

The first two lines build two lists, one of source names and one of destination names (to be passed to "gsutil mv"). The sed pattern is anchored with $ so that only the trailing extension is changed:

gsutil ls gs://bucket_name/*.JPG > src-rename-list.txt
sed 's/\.JPG$/.jpg/' src-rename-list.txt > dest-rename-list.txt

The two lists are then joined line by line, and "gsutil mv " is prepended to each line:

paste -d ' ' src-rename-list.txt dest-rename-list.txt | sed -e 's/^/gsutil mv /'

Each resulting line is then run as a command in a while loop: while read -r line; do bash -c "$line"; done

Because every line is re-parsed by bash, this assumes the object names contain no spaces or shell metacharacters.

Lastly, clean up by deleting the temporary files:

rm src-rename-list.txt; rm dest-rename-list.txt

The above has been tested against a working Google Storage bucket.
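
As a rough illustration of the substitution step, here is a small Python sketch that derives the destination name from a source name and prints the resulting "gsutil mv" lines without running them. The helper name to_lowercase_ext and the sample object names are made up for this example. The pattern is anchored to the end of the name so that a ".JPG" appearing earlier in the name is left alone.

```python
import re

def to_lowercase_ext(url):
    """Return url with a trailing .JPG changed to .jpg (hypothetical helper)."""
    # The $ anchor restricts the substitution to the final extension.
    return re.sub(r"\.JPG$", ".jpg", url)

# Made-up object names, standing in for the output of "gsutil ls".
sources = [
    "gs://bucket_name/IMG_1.JPG",
    "gs://bucket_name/photo.JPG.copy.JPG",
]

for src in sources:
    # Only prints the command that would be run; nothing is renamed here.
    print("gsutil mv", src, to_lowercase_ext(src))
```

Printing the commands first is a cheap way to check the source and destination pairs before anything is actually moved.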

beetlejuice answered Oct 18 '22 01:10



https://cloud.google.com/storage/docs/gsutil/addlhelp/WildcardNames

gsutil supports URI wildcards

EDIT

gsutil 3.0 release note

As part of the bucket sub-directory support we changed the * wildcard to match only up to directory boundaries, and introduced the new ** wildcard...

Do you have directories under the bucket? If so, you may need to go into each directory or use **.

gsutil -m mv gs://my_bucket/**.JPG gs://my_bucket/**.jpg

or

gsutil -m mv gs://my_bucket/mydir/*.JPG gs://my_bucket/mydir/*.jpg
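
To make the difference between * and ** in the release note quoted above more concrete, here is a simplified Python sketch. It is only an approximation written for illustration, not gsutil's actual matching code, and the function names and sample object names are made up: * is translated to "anything except a slash" and ** to "anything at all".

```python
import re

def wildcard_to_regex(pattern):
    """Translate a simplified gsutil-style wildcard into a regex (illustrative only)."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**", i):
            parts.append(".*")      # ** may cross directory boundaries
            i += 2
        elif pattern[i] == "*":
            parts.append("[^/]*")   # * stops at the next "/"
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts) + r"\Z")

def matches(pattern, url):
    """True if url matches the simplified wildcard pattern."""
    return wildcard_to_regex(pattern).match(url) is not None

# Made-up object names: one at the top of the bucket, one in a sub-directory.
names = ["gs://my_bucket/a.JPG", "gs://my_bucket/mydir/b.JPG"]
for name in names:
    print(name,
          matches("gs://my_bucket/*.JPG", name),
          matches("gs://my_bucket/**.JPG", name))
```

Under this approximation, the object in mydir is matched only by the ** pattern, which is why the sub-directory case needs either ** or an explicit directory in the path.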

EDIT
gsutil does not support wildcards in the destination so far (as of 4/12/'14), and neither does the API, so the commands above do not work as written.

So at the moment you need to retrieve a list of all the JPG files and rename each file individually.

Python example:

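import subprocess

# List the matching objects; universal_newlines=True returns text rather than bytes on Python 3.
files = subprocess.check_output(["gsutil", "ls", "gs://my_bucket/*.JPG"], universal_newlines=True)
for f in files.splitlines():
    # Replace the trailing "JPG" with "jpg"; passing a list avoids shell quoting problems.
    subprocess.call(["gsutil", "mv", f, f[:-3] + "jpg"])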

Please note that this can take hours for a large bucket, because each file is renamed by a separate gsutil call.
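
One possible way to reduce the waiting time is to issue several "gsutil mv" calls at once from a thread pool. The sketch below is an untested idea rather than a measured improvement: the function names, the sample object names and the worker count of 8 are all assumptions made for this example. The command runner is passed in as a parameter, so the demo at the bottom only prints the commands instead of executing them.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

def lowercase_jpg(url):
    """Return url with a trailing .JPG replaced by .jpg (other names unchanged)."""
    return url[:-4] + ".jpg" if url.endswith(".JPG") else url

def build_commands(urls):
    """Build one 'gsutil mv' argument list per object that ends in .JPG."""
    return [["gsutil", "mv", u, lowercase_jpg(u)] for u in urls if u.endswith(".JPG")]

def rename_all(urls, run=subprocess.call, workers=8):
    """Run the mv commands on a thread pool; 'run' can be swapped for a dry run."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run, build_commands(urls)))

def dry_run(cmd):
    """Print the command instead of executing it, and report success."""
    print(" ".join(cmd))
    return 0

# Made-up object names; in practice these would come from "gsutil ls".
sample = ["gs://my_bucket/a.JPG", "gs://my_bucket/mydir/b.JPG"]
rename_all(sample, run=dry_run)
```

Calling rename_all(urls) without the run argument would execute the real commands through subprocess.call, so it is worth checking the dry-run output first.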

HayatoY answered Oct 18 '22 00:10
