What are the atomic guarantees on copying a folder in GCS?

I have a GCS bucket containing a directory my-bucket-name/my-temp-dir-name. This directory contains many subfiles. I would like to execute a copy command, e.g. gsutil cp gs://my-bucket-name/my-temp-dir-name gs://my-bucket-name/my-dir-name.

Are there any atomic guarantees around this operation? Is it possible that some files will be accessible in my-dir-name before all the files are available? What if my-dir-name already exists?

Max asked Oct 31 '25 03:10


2 Answers

Individual object copies are atomic, but GCS does not support atomicity of copies across multiple objects.
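To make the consequence concrete, here is a minimal Python sketch that models a bucket as an in-memory dictionary. It does not call GCS; `copy_prefix` and `list_prefix` are hypothetical helpers invented for illustration. It only shows that when objects are copied one by one, a reader listing the destination partway through sees a partial set.

```python
from typing import Dict, Iterator, List


def copy_prefix(bucket: Dict[str, bytes], src: str, dst: str) -> Iterator[str]:
    """Copy every object under `src` to `dst`, one object at a time.

    Yields each destination name right after that object is written, so the
    caller can inspect the bucket between individual copies.
    """
    # sorted() materializes the source names before the bucket is mutated.
    for name in sorted(k for k in bucket if k.startswith(src)):
        new_name = dst + name[len(src):]
        bucket[new_name] = bucket[name]  # a single object appears all at once
        yield new_name


def list_prefix(bucket: Dict[str, bytes], prefix: str) -> List[str]:
    """Return the object names that start with `prefix`."""
    return sorted(k for k in bucket if k.startswith(prefix))


# Toy bucket contents (assumed file names, not from the question).
bucket = {
    "my-temp-dir-name/a.txt": b"a",
    "my-temp-dir-name/b.txt": b"b",
    "my-temp-dir-name/c.txt": b"c",
}

steps = copy_prefix(bucket, "my-temp-dir-name/", "my-dir-name/")
next(steps)                                   # only the first object is copied
partial = list_prefix(bucket, "my-dir-name/")
for _ in steps:                               # let the remaining copies finish
    pass
complete = list_prefix(bucket, "my-dir-name/")

print(partial)
print(complete)
```

Each object is either fully present or absent, but nothing ties the objects together, so the destination is visible in an incomplete state until the last copy finishes.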

Mike Schwartz answered Nov 02 '25 23:11


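The destination bucket must exist before copying; otherwise the cp operation fails with a 404 when trying to locate the bucket. The my-dir-name "directory" itself does not need to exist beforehand, because in a bucket with the default flat namespace a folder is just a prefix of object names.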
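On the question of what happens if my-dir-name already exists: as far as I understand the naming rules documented for gsutil cp -r, the resulting object names depend on whether the destination "subdirectory" already exists. The sketch below models that rule in plain Python. `destination_name` is a hypothetical helper, not part of gsutil, and the file name is made up; check the real result with gsutil ls before relying on it.

```python
def destination_name(src_object: str, src_dir: str, dst_dir: str,
                     dst_dir_exists: bool) -> str:
    """Model the object name produced by `gsutil cp -r src_dir dst_dir`.

    Assumption (from the gsutil naming documentation): if dst_dir already
    exists as a subdirectory, the source directory is nested inside it;
    otherwise the source contents are placed directly under dst_dir.
    """
    src_dir = src_dir.rstrip("/")
    dst_dir = dst_dir.rstrip("/")
    relative = src_object[len(src_dir) + 1:]   # path below the source dir
    if dst_dir_exists:
        leaf = src_dir.rsplit("/", 1)[-1]      # last component of src_dir
        return f"{dst_dir}/{leaf}/{relative}"
    return f"{dst_dir}/{relative}"


fresh = destination_name("my-temp-dir-name/file.txt",
                         "my-temp-dir-name", "my-dir-name",
                         dst_dir_exists=False)
existing = destination_name("my-temp-dir-name/file.txt",
                            "my-temp-dir-name", "my-dir-name",
                            dst_dir_exists=True)
print(fresh)
print(existing)
```

Under that assumption, running the same recursive copy twice can put the second run's objects one level deeper than the first, which is worth knowing before scripting the command.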

Objects are copied independently (one at a time, or in parallel if you run gsutil with the -m flag). Therefore, objects will start to appear under my-dir-name as soon as each individual copy completes, before the whole operation has finished.

Note that objects in GCS are immutable, and operations are atomic at the object level. This means that the most recently written object wins: it replaces any previous object with the same name. If you want to keep previous versions, you can enable Object Versioning on the bucket, and overwritten objects will be retained as noncurrent versions.
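The "latest write wins" behaviour, with and without versioning, can be sketched with a toy in-memory model. This is an illustration only: `VersionedBucket` is a hypothetical class, and the small counter stands in for the generation numbers that GCS assigns to real objects.

```python
import itertools
from collections import defaultdict
from typing import Dict, List, Tuple


class VersionedBucket:
    """Toy model of overwrite semantics; not the GCS client API."""

    def __init__(self, versioning: bool = False) -> None:
        self.versioning = versioning
        # object name -> list of (generation, data), oldest first
        self._objects: Dict[str, List[Tuple[int, bytes]]] = defaultdict(list)
        self._counter = itertools.count(1)  # simplified generation numbers

    def write(self, name: str, data: bytes) -> int:
        """Write a whole object; the new data replaces the live version."""
        generation = next(self._counter)
        if self.versioning:
            self._objects[name].append((generation, data))  # keep old ones
        else:
            self._objects[name] = [(generation, data)]      # discard old ones
        return generation

    def read(self, name: str) -> bytes:
        """Return the live (most recently written) data for an object."""
        return self._objects[name][-1][1]

    def generations(self, name: str) -> List[int]:
        """Return the generations still stored for an object."""
        return [generation for generation, _ in self._objects[name]]


plain = VersionedBucket(versioning=False)
plain.write("my-dir-name/a.txt", b"old")
plain.write("my-dir-name/a.txt", b"new")

versioned = VersionedBucket(versioning=True)
versioned.write("my-dir-name/a.txt", b"old")
versioned.write("my-dir-name/a.txt", b"new")

print(plain.read("my-dir-name/a.txt"), plain.generations("my-dir-name/a.txt"))
print(versioned.read("my-dir-name/a.txt"), versioned.generations("my-dir-name/a.txt"))
```

In both cases a reader sees only the newest data; versioning changes whether the older data is still retrievable.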

Bonus: If you are copying many objects at once, use the -m flag to copy several objects in parallel, like so:

gsutil -m cp -r gs://my-bucket-name/my-temp-dir-name gs://my-bucket-name/my-dir-name
or
gsutil -m cp gs://my-bucket-name/my-temp-dir-name/* gs://my-bucket-name/my-dir-name

Jose L Ugia answered Nov 02 '25 22:11