
How can I append data to a file on Google Cloud Storage?

I am creating a CSV file on Google Cloud Storage using a Google Cloud Function. Now I want to edit that file. Is it possible to append data to it? If so, how?

Ansar Ahmed asked Nov 20 '19 05:11



3 Answers

Google Cloud Storage is the managed object storage service for Google Cloud Platform. Unlike block storage or file system storage, the objects it stores are immutable.

As the official documentation puts it:

Objects are immutable, which means that an uploaded object cannot change throughout its storage lifetime. An object's storage lifetime is the time between successful object creation (upload) and successful object deletion. In practice, this means that you cannot make incremental changes to objects, such as append operations or truncate operations. However, it is possible to overwrite objects that are stored in Cloud Storage, and doing so happens atomically — until the new upload completes the old version of the object will be served to readers, and after the upload completes the new version of the object will be served to readers. So a single overwrite operation simply marks the end of one immutable object's lifetime and the beginning of a new immutable object's lifetime.

As a workaround, you can upload multiple files to a bucket and then create a new object by composing the previous ones:

gsutil compose gs://bucket/obj1 [gs://bucket/obj2 ...] gs://bucket/composite

Note that the compose operation is also available via the JSON API:

POST https://storage.googleapis.com/storage/v1/b/bucket/o/destinationObject/compose

It is available via the Cloud Storage client libraries as well.

So this call can easily be integrated into your code. Be sure to first grant the role needed to access the bucket.
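As a rough illustration of the client-library route, here is a minimal Python sketch that appends data by uploading it as a temporary object and composing it onto the existing one. The function name `append_via_compose`, the temporary object naming scheme, and the `text/csv` default are assumptions made for this example. `bucket` is expected to behave like a `google.cloud.storage` bucket, for example `storage.Client().bucket('thehotbucket')`.

```python
import uuid


def append_via_compose(bucket, object_name, new_data, content_type="text/csv"):
    """Append new_data to an existing object by composing it with a temp object.

    `bucket` is assumed to be a google.cloud.storage Bucket (or an object with
    the same blob/upload_from_string/compose/delete methods).
    """
    target = bucket.blob(object_name)

    # Hypothetical naming scheme for the temporary object holding the new data.
    temp = bucket.blob(f"{object_name}.append-{uuid.uuid4().hex}")
    temp.upload_from_string(new_data, content_type=content_type)

    try:
        # Set the content type explicitly, since this blob handle was created
        # locally and has no metadata loaded from the server.
        target.content_type = content_type
        # The destination may also be the first source, so the result is
        # "old content + new content" written back under the same name.
        target.compose([target, temp])
    finally:
        # Remove the temporary object whether or not compose succeeded.
        temp.delete()

    return target
```

The data never passes through the function's memory or local disk, which may matter for large files. Note that compose still replaces the object with a new one rather than modifying it in place.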

Check the official documentation for details.

Thierry Falvo answered Oct 22 '22 05:10


I'm using this Python script to append data to a CSV file. It downloads the file, appends the data, and uploads it again under the same object name in your bucket. You can easily adapt this for your Cloud Function.

import csv
from google.cloud import storage

# In Cloud Functions, /tmp is the only writable directory.
local_path = '/tmp/data1.csv'

client = storage.Client()
bucket = client.get_bucket('thehotbucket')
blob = bucket.get_blob('data1.csv')

# Download the current object, append one row locally, then overwrite the object.
blob.download_to_filename(local_path)
fields = ['first', 'second', 'third']
with open(local_path, 'a', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(fields)

blob.upload_from_filename(local_path)

If you only want to merge files, you can use the gsutil command:

gsutil compose gs://bucket/obj1 [gs://bucket/obj2 ...] gs://bucket/obj1

Chris32 answered Oct 22 '22 05:10


GCS is an object store and doesn't allow you to update or edit a file once it has been pushed to a bucket.

The only way to update a file that lives in a GCS bucket is to download the file, make the required changes, and then push it back to the bucket. This overwrites the file with the new content.
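The download, modify, and push-back cycle can also be done in memory, without a local file. The sketch below is one possible shape for it, not a definitive implementation. The function name `append_row_in_memory` is hypothetical, and `bucket` is assumed to be a `google.cloud.storage` bucket. It also passes `if_generation_match` so the upload fails if another writer replaced the object in between, which is an addition beyond what the answer describes.

```python
import csv
import io


def append_row_in_memory(bucket, object_name, row):
    """Download a CSV object, append one row in memory, and overwrite it.

    `bucket` is assumed to be a google.cloud.storage Bucket (or compatible).
    """
    # get_blob fetches metadata, including the object's current generation.
    blob = bucket.get_blob(object_name)
    if blob is None:
        raise FileNotFoundError(object_name)

    existing = blob.download_as_bytes().decode("utf-8")

    buffer = io.StringIO()
    buffer.write(existing)
    # Make sure the new row starts on its own line.
    if existing and not existing.endswith("\n"):
        buffer.write("\n")
    csv.writer(buffer, lineterminator="\n").writerow(row)

    # Overwrite only if the object is still the generation we downloaded,
    # so a concurrent writer's change is not silently lost.
    blob.upload_from_string(
        buffer.getvalue(),
        content_type="text/csv",
        if_generation_match=blob.generation,
    )
    return blob
```

If the precondition fails, the client library raises an error, and the caller can download the object again and retry.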

pradeep answered Oct 22 '22 05:10