Python - download entire directory from Google Cloud Storage

The following page

https://googlecloudplatform.github.io/google-cloud-python/latest/storage/blobs.html

lists all the API calls available for using Google Cloud Storage from Python, but I can't find one that downloads an entire directory. Even the "official" samples on GitHub

https://github.com/GoogleCloudPlatform/python-docs-samples/blob/master/storage/cloud-client/snippets.py

don't include a related example.

Finally, downloading a directory with the same method used to download single files fails with the error

Error:  [Errno 21] Is a directory:
user1403546 asked Apr 10 '18 08:04


2 Answers

You just have to first list all the files in a directory and then download them one by one:

from google.cloud import storage

bucket_name = 'your-bucket-name'
prefix = 'your-bucket-directory/'
dl_dir = 'your-local-directory/'

storage_client = storage.Client()
bucket = storage_client.get_bucket(bucket_name)
blobs = bucket.list_blobs(prefix=prefix)  # List every object under the prefix
for blob in blobs:
    filename = blob.name.replace('/', '_')  # Flatten the object path into one file name
    blob.download_to_filename(dl_dir + filename)  # Download into the local directory

blob.name contains the full object path (directory structure plus file name), so if you want the same file name as in the bucket, extract the last path component instead of replacing / with _.
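As a minimal sketch of that extraction: the helper name local_filename is hypothetical and not part of the Cloud Storage client; it only assumes that object names use / as the separator.

```python
import posixpath


def local_filename(blob_name: str) -> str:
    """Return the last path component of a blob name (hypothetical helper)."""
    # Blob names always use "/" as the separator, whatever the local OS is,
    # so posixpath is used instead of os.path.
    return posixpath.basename(blob_name)


print(local_filename("your-bucket-directory/reports/summary.csv"))  # summary.csv
```

Note that two objects with the same file name in different subdirectories would then overwrite each other locally, and a name ending in / yields an empty string, so such placeholder objects should be skipped.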

ksbg answered Oct 17 '22 14:10


If you want to keep the same directory structure without renaming, and also create the nested folders locally, here is a solution for Python 3.5+ based on @ksbg's answer:

from pathlib import Path
from google.cloud import storage

bucket_name = 'your-bucket-name'
prefix = 'your-bucket-directory/'
dl_dir = 'your-local-directory/'

storage_client = storage.Client()
bucket = storage_client.get_bucket(bucket_name)
blobs = bucket.list_blobs(prefix=prefix)  # List every object under the prefix
for blob in blobs:
    if blob.name.endswith("/"):
        continue  # Skip "directory" placeholder objects
    target = Path(dl_dir) / blob.name  # Mirror the bucket path under dl_dir
    target.parent.mkdir(parents=True, exist_ok=True)  # Create nested folders
    blob.download_to_filename(str(target))
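The loop above keeps the bucket prefix as part of the local path. If you would rather have the contents of the prefix land directly in the local directory, one possible sketch is below. The helper target_path is hypothetical, and it assumes the prefix names a whole directory (ends with /), since relative_to raises ValueError otherwise.

```python
from pathlib import Path, PurePosixPath


def target_path(dl_dir: str, prefix: str, blob_name: str) -> Path:
    """Map a blob name to a local path with the bucket prefix stripped.

    Hypothetical helper; assumes blob_name lies under the directory `prefix`.
    """
    # Blob names are "/"-separated, so parse them as POSIX paths.
    relative = PurePosixPath(blob_name).relative_to(PurePosixPath(prefix))
    # Re-join the components using the local platform's path rules.
    return Path(dl_dir).joinpath(*relative.parts)


example = target_path(
    "your-local-directory/",
    "your-bucket-directory/",
    "your-bucket-directory/reports/summary.csv",
)
print(example.as_posix())  # your-local-directory/reports/summary.csv
```

Inside the loop you would compute target = target_path(dl_dir, prefix, blob.name), create target.parent with mkdir(parents=True, exist_ok=True), and pass str(target) to blob.download_to_filename.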
Axel answered Oct 17 '22 15:10