
How to move files in Google Cloud Storage from one bucket to another with Python

Is there an API function that lets us move files in Google Cloud Storage from one bucket to another?

The scenario is that we want Python to move files that have already been read from bucket A to bucket B. I know that gsutil can do this, but I am not sure whether Python supports it.

Thanks.

user3769827 asked Sep 22 '14 10:09


People also ask

How do I transfer data from one bucket to another bucket in Google Cloud?

Open the web console and go to Storage > Transfer to create a new transfer. Select the source bucket you want to copy from.

What costs may you incur when transferring data from one Google Cloud Storage bucket to another?

Moving data between locations incurs network usage costs. In addition, moving data between buckets may incur retrieval and early deletion fees, if the data being moved are Nearline storage, Coldline storage, or Archive storage objects.


3 Answers

Here's a function I use when moving blobs between directories within the same bucket or to a different bucket.

import os

from google.cloud import storage

# Point the client library at a service-account key file.
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "path_to_your_creds.json"

def mv_blob(bucket_name, blob_name, new_bucket_name, new_blob_name):
    """
    Move a file between directories or buckets. This uses GCS's copy
    function, then deletes the blob from the old location.
    
    inputs
    -----
    bucket_name: name of bucket
    blob_name: str, name of file 
        ex. 'data/some_location/file_name'
    new_bucket_name: name of bucket (can be same as original if we're just moving around directories)
    new_blob_name: str, name of file in new directory in target bucket 
        ex. 'data/destination/file_name'
    """
    storage_client = storage.Client()
    source_bucket = storage_client.get_bucket(bucket_name)
    source_blob = source_bucket.blob(blob_name)
    destination_bucket = storage_client.get_bucket(new_bucket_name)

    # Copy to the new destination.
    new_blob = source_bucket.copy_blob(
        source_blob, destination_bucket, new_blob_name)
    # Delete from the old location; this only runs if the copy did not raise.
    source_blob.delete()

    print(f'File moved from {source_blob.name} to {new_blob.name}')
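
To move many objects at once, the same copy-then-delete pattern can be applied to every blob under a prefix. What follows is a minimal sketch and not part of the original answer: `move_prefix` is a hypothetical helper name, and it assumes `client` is a `google.cloud.storage.Client` (only `bucket`, `list_blobs`, `copy_blob`, and `delete` are used).

```python
def move_prefix(client, source_bucket_name, destination_bucket_name, prefix=""):
    """Copy every blob under `prefix` to the destination bucket, then delete it.

    `client` is assumed to be a google.cloud.storage.Client. Returns the list
    of blob names that were moved.
    """
    source_bucket = client.bucket(source_bucket_name)
    destination_bucket = client.bucket(destination_bucket_name)

    # Materialize the listing first so deletions cannot disturb pagination.
    blobs = list(client.list_blobs(source_bucket_name, prefix=prefix))

    moved = []
    for blob in blobs:
        # copy_blob raises on failure, so the delete below only runs
        # once the copy has succeeded.
        source_bucket.copy_blob(blob, destination_bucket, blob.name)
        blob.delete()
        moved.append(blob.name)
    return moved
```

For example, `move_prefix(storage.Client(), 'bucket-a', 'bucket-b', prefix='data/')` (bucket names are placeholders) would move everything under `data/` and keep the object names. The move is not atomic: if the process stops partway, some objects may exist in both buckets.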
dmlee8 answered Sep 21 '22 13:09


Using the google-api-python-client, there is an example on the storage.objects.copy page. After you copy, you can delete the source with storage.objects.delete.

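import json

from googleapiclient import discovery

# Assumption: the JSON API client is built like this and picks up
# Application Default Credentials from the environment.
client = discovery.build('storage', 'v1')

# bucket1/old_object and bucket2/new_object name the source and destination.
destination_object_resource = {}
req = client.objects().copy(
        sourceBucket=bucket1,
        sourceObject=old_object,
        destinationBucket=bucket2,
        destinationObject=new_object,
        body=destination_object_resource)
resp = req.execute()
print(json.dumps(resp, indent=2))

# Delete the source only after the copy request has returned successfully.
client.objects().delete(
        bucket=bucket1,
        object=old_object).execute()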
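
For large objects, or for copies between locations or storage classes, the JSON API also offers storage.objects.rewrite, which may have to be called repeatedly until the response reports `done`. The sketch below is an illustration rather than part of the original answer: `rewrite_then_delete` is a hypothetical helper name, and `client` is assumed to be a JSON API client built with `googleapiclient.discovery.build('storage', 'v1')`.

```python
def rewrite_then_delete(client, source_bucket, source_object,
                        destination_bucket, destination_object):
    """Move one object with objects().rewrite, then delete the source.

    Returns the destination object resource from the final rewrite response.
    """
    token = None
    while True:
        kwargs = dict(
            sourceBucket=source_bucket,
            sourceObject=source_object,
            destinationBucket=destination_bucket,
            destinationObject=destination_object,
            body={})
        if token:
            # Continue a rewrite that did not finish in the previous call.
            kwargs['rewriteToken'] = token
        resp = client.objects().rewrite(**kwargs).execute()
        if resp.get('done'):
            break
        token = resp['rewriteToken']

    # The source is removed only once the rewrite has reported completion.
    client.objects().delete(
        bucket=source_bucket, object=source_object).execute()
    return resp.get('resource')
```

As with copy followed by delete, this is two separate operations, so a failure between them leaves the object in both buckets.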
jterrace answered Sep 20 '22 13:09


You can use the GCS client library functions documented at [1] to read from one bucket, write to the other, and then delete the source file.

You can even use the GCS REST API documented at [2].

Links:
[1] - https://developers.google.com/appengine/docs/python/googlecloudstorageclient/functions
[2] - https://developers.google.com/storage/docs/concepts-techniques#overview
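
The read, write, then delete approach can be sketched as follows. This is an illustration only: it swaps in the `google-cloud-storage` package rather than the App Engine library linked at [1], `move_by_read_write` is a hypothetical helper name, and both bucket arguments are assumed to be `google.cloud.storage` Bucket objects.

```python
def move_by_read_write(source_bucket, destination_bucket, blob_name,
                       new_blob_name=None):
    """Read an object from one bucket, write it to another, delete the source.

    The whole object is held in memory, so this suits small files only.
    """
    new_blob_name = new_blob_name or blob_name
    source_blob = source_bucket.blob(blob_name)

    # Download the object's contents, then upload them under the new name.
    data = source_blob.download_as_bytes()
    destination_bucket.blob(new_blob_name).upload_from_string(data)

    # Delete the source only after the upload has returned without error.
    source_blob.delete()
    return new_blob_name
```

Unlike a server-side copy, this sends the data through the machine running the script and does not carry over metadata such as the content type, so a copy call is usually the better choice when one is available.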

Paolo P. answered Sep 22 '22 13:09