
How to copy a directory to Google Cloud Storage using the Google Cloud Python API?

The following function works well for copying a single file to Google Cloud Storage.

#!/usr/bin/python3.5
import googleapiclient.discovery

from google.cloud import storage

def upload_blob(bucket_name, source_file_name, destination_blob_name, project):
    storage_client = storage.Client(project=project)
    bucket = storage_client.get_bucket(bucket_name)
    blob = bucket.blob(destination_blob_name)
    blob.upload_from_filename(source_file_name)
    print('File {} uploaded to {}.'.format(
        source_file_name,
        destination_blob_name))

Now, instead of giving a file name, I tried passing a directory name, upload_blob('mybucket', '/data/inputdata/', 'myapp/inputdata/', 'myapp'), but then I get this error:

AttributeError: 'str' object has no attribute 'read'

Do I need to give any additional parameters when calling blob.upload_from_filename() to copy a directory?

Vishnu P N asked Jan 30 '18


2 Answers

Here's some code you can use to accomplish this:

import glob
import os

def copy_local_directory_to_gcs(local_path, bucket, gcs_path):
    """Recursively copy a directory of files to GCS.

    local_path should be a directory and not have a trailing slash.
    """
    assert os.path.isdir(local_path)
    # recursive=True is required for '**' to match files in subdirectories
    for local_file in glob.glob(local_path + '/**', recursive=True):
        if not os.path.isfile(local_file):
            continue  # skip directories; GCS has no directory objects
        remote_path = os.path.join(gcs_path, local_file[1 + len(local_path):])
        blob = bucket.blob(remote_path)
        blob.upload_from_filename(local_file)

Use it like so:

copy_local_directory_to_gcs('path/to/foo', bucket, 'remote/path/to/foo')

Where bucket is the usual object from the Google Cloud Storage API:

from google.cloud import storage    
client = storage.Client(project='your-project')
bucket = client.get_bucket('bucket-name')
danvk answered Nov 08 '22


Uploading more than one file at a time is not a built-in feature of the API. You can either copy the files one by one in a loop, or use the gsutil command-line utility, which can copy whole directories with gsutil cp -r.
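A minimal sketch of the loop approach (the function names iter_upload_pairs and upload_directory are my own, not part of the API) separates the local-to-remote path mapping from the actual uploads:

```python
import os

def iter_upload_pairs(local_dir, gcs_prefix):
    """Yield (local_file, remote_object_name) for every file under local_dir."""
    for root, _dirs, files in os.walk(local_dir):
        for name in files:
            local_file = os.path.join(root, name)
            rel = os.path.relpath(local_file, local_dir)
            # GCS object names always use forward slashes, regardless of OS
            yield local_file, gcs_prefix.rstrip('/') + '/' + rel.replace(os.sep, '/')

def upload_directory(bucket_name, local_dir, gcs_prefix, project):
    """Upload every file under local_dir to gs://bucket_name/gcs_prefix/..."""
    from google.cloud import storage  # deferred import: path helper works without the library
    client = storage.Client(project=project)
    bucket = client.get_bucket(bucket_name)
    for local_file, remote_path in iter_upload_pairs(local_dir, gcs_prefix):
        bucket.blob(remote_path).upload_from_filename(local_file)
```

With the names from the question, upload_directory('mybucket', '/data/inputdata', 'myapp/inputdata', 'myapp') would mirror the directory, one upload_from_filename call per file.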

Brandon Yarbrough answered Nov 08 '22