Is there any advantage to using gsutil or the Google Cloud Storage API in production transfers?

Question

Which is better for production transfers: gsutil or the Google Cloud Storage API?

Brandon Yarbrough · Accepted Answer

gsutil transfers data through a Google Cloud Storage API, specifically the JSON API by default (this is configurable). Its main advantage over using the API directly is that it has been tuned to transfer data quickly. For example, it can open multiple simultaneous connections to GCS, each uploading or downloading part of the file concurrently, which in many cases significantly boosts total throughput.

There's no reason that programming against the API directly could not also provide the same or even better performance, but I would expect gsutil to be at least a little bit faster on average if you implement things in the simplest possible manner.
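To make the "part of the file per connection" idea concrete, here is a minimal sketch of how code written against the API might do something similar. It is not how gsutil is actually implemented. The names `split_ranges`, `fetch_in_slices`, and `read_range` are hypothetical, and `read_range(start, end)` stands in for whatever call returns the bytes of an inclusive byte range.

```python
from concurrent.futures import ThreadPoolExecutor


def split_ranges(size, parts):
    """Split [0, size) into at most `parts` contiguous (start, end) ranges, end inclusive."""
    if size <= 0 or parts <= 0:
        return []
    chunk = -(-size // parts)  # ceiling division
    return [(start, min(start + chunk, size) - 1) for start in range(0, size, chunk)]


def fetch_in_slices(read_range, size, parts=4):
    """Fetch each byte range on its own thread and join the pieces in order.

    `read_range(start, end)` is a hypothetical callable supplied by the caller.
    """
    ranges = split_ranges(size, parts)
    with ThreadPoolExecutor(max_workers=max(1, len(ranges))) as pool:
        # pool.map yields results in the same order as `ranges`.
        pieces = pool.map(lambda r: read_range(*r), ranges)
        return b"".join(pieces)
```

With the google-cloud-storage Python client, `read_range` could plausibly wrap `blob.download_as_bytes(start=s, end=e)`, assuming the library is installed and credentials are set up. Whether this beats a single stream depends on object size and network conditions.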

Paul · Answer

I'm not sure this adds much to what Brandon has said. I'm very new to gcloud storage and Python, but I've quickly found that I prefer the gsutil command line over the Python client library wherever possible. I create compute instances that copy a few GB of input data from cloud storage after they have booted. I found that it's both neater and faster to do this with the gsutil command line, so in my Python code I use:

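import subprocess
# -m runs the copies in parallel; gsutil expands the gs:// wildcard itself, so no shell is needed.
subprocess.run(["gsutil", "-m", "cp", "gs://my-uberdata-archive/*", "/home/<username>/rawdata/"], check=True)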

My main reasons are that the command fits on a single line, whereas the client library takes several, and, as Brandon points out, gsutil supports multi-threading with the '-m' flag. I haven't found an equivalent way to do this with the Python client library yet.
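For comparison, one way to approximate '-m' from Python is to hand each object to a thread pool. This is only a rough sketch, not a built-in feature of the client library. The helpers `copy_all` and `download_bucket` are hypothetical names, and `download_bucket` assumes the google-cloud-storage package is installed and credentials are configured. Object names are flattened to their base name, which is a simplification.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def copy_all(names, fetch_one, dest_dir, max_workers=8):
    """Call fetch_one(name, local_path) for every name on a thread pool."""
    os.makedirs(dest_dir, exist_ok=True)

    def job(name):
        # Simplification: keep only the last path component of the object name.
        local_path = os.path.join(dest_dir, os.path.basename(name))
        fetch_one(name, local_path)
        return local_path

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # list() forces completion and re-raises the first failure, if any.
        return list(pool.map(job, names))


def download_bucket(bucket_name, dest_dir):
    """Download every object in a bucket; assumes google-cloud-storage is available."""
    from google.cloud import storage  # third-party dependency, imported lazily

    bucket = storage.Client().bucket(bucket_name)
    names = [b.name for b in bucket.list_blobs() if not b.name.endswith("/")]
    return copy_all(names, lambda n, p: bucket.blob(n).download_to_filename(p), dest_dir)
```

A call such as `download_bucket("my-uberdata-archive", "/home/<username>/rawdata/")` would then play the role of the gsutil command above, though it is clearly more than one line.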

Tags:

google-cloud-storage

gsutil

mohawkTrail