
How do I read the contents of a new cloud storage file of type .json from within a cloud function?

The event passed to my Google cloud function only really tells me the name of the bucket and file, and whether the file was deleted. Yes, there's more there, but it doesn't seem all that useful:

{ timestamp: '2017-03-25T07:13:40.293Z',
  eventType: 'providers/cloud.storage/eventTypes/object.change',
  resource: 'projects/_/buckets/my-echo-bucket/objects/base.json#1490426020293545',
  data: { kind: 'storage#object',
          resourceState: 'exists',
          id: 'my-echo-bucket/base.json/1490426020293545',
          selfLink: 'https://www.googleapis.com/storage/v1/b/my-echo-bucket/o/base.json',
          name: 'base.json',
          bucket: 'my-echo-bucket',
          generation: '1490426020293545',
          metageneration: '1',
          contentType: 'application/json',
          timeCreated: '2017-03-25T07:13:40.185Z',
          updated: '2017-03-25T07:13:40.185Z',
          storageClass: 'STANDARD',
          size: '548',
          md5Hash: 'YzE3ZjUyZjlkNDU5YWZiNDg2NWI0YTEyZWZhYzQyZjY=',
          mediaLink: 'https://www.googleapis.com/storage/v1/b/my-echo-bucket/o/base.json?generation=1490426020293545&alt=media',
          contentLanguage: 'en',
          crc32c: 'BQDL9w==' }
}

How do I get the contents and not merely the meta-data of a new .json file uploaded to a gs bucket?

I tried using npm:request() on event.data.selfLink, which is a URL for the file in the storage bucket, and got back an authorization error:

"code": 401, "message": "Anonymous users does not have storage.objects.get access to object my-echo-bucket/base.json."

There was a similar question on SO about reading storage buckets, but it seems to be about a different platform, and in any case it was unanswered:

How do I read the contents of a file on Google Cloud Storage using javascript

asked Mar 25 '17 by Paul

2 Answers

You need to use a client library for Google Cloud Storage instead of accessing the file via its URL. Using request() against the URL would only work if the file were exposed to public access.
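
(For completeness, an object can be made publicly readable, e.g. with gsutil as shown below, but opening up public access just so a function can read the file is not the right approach.)

gsutil acl ch -u AllUsers:R gs://my-echo-bucket/base.json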

Import the google cloud storage library in the npm-managed directory containing your project.

npm i @google-cloud/storage -S

The npm page for @google-cloud/storage has decent examples, but I had to read through the API docs a bit to find an easy way to download a file into memory.

Within the Google Cloud Functions environment, you do not need to supply an API key or other credentials when initializing the storage client.

const storage = require('@google-cloud/storage')();

The metadata passed in the event can be used to decide whether you actually want the file.

When you do want it, you can fetch it with the file.download() method, which either takes a callback or, if no callback is given, returns a promise. The promise resolves with the file contents as a Buffer (wrapped in a one-element array), so you will need to call toString('utf-8') on it to get a UTF-8 encoded string.

const storage = require('@google-cloud/storage')();

exports.logNewJSONFiles = function logNewJSONFiles(event){
    return new Promise(function(resolve, reject){
        const file = event.data;
        if (!file){
            console.log("not a file event");
            return resolve();
        }
        if (file.resourceState === 'not_exists'){
            console.log("file deletion event");
            return resolve();
        }
        if (file.contentType !== 'application/json'){
            console.log("not a json file");
            return resolve();
        }
        if (!file.bucket){
            console.log("bucket not provided");
            return resolve();
        }
        if (!file.name){
            console.log("file name not provided");
            return resolve();
        }
        storage
            .bucket(file.bucket)
            .file(file.name)
            .download()
            .then(function(data){
                // download() resolves with a one-element array: [contents]
                const contents = data && data[0];
                return contents && contents.toString('utf-8');
            })
            .then(function(data){
                if (data) {
                    console.log("new file " + file.name);
                    console.log(data);
                }
                resolve(data);
            })
            .catch(function(e){ reject(e); });
    });
};

Deployment is as expected:

gcloud beta functions deploy logNewJSONFiles --stage-bucket gs://my-stage-bucket --trigger-bucket gs://my-echo-bucket

Remember to look in the Stackdriver Logging page on Google Cloud Platform for the console.log entries.
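
The same entries can also be read from the command line, assuming a reasonably current gcloud CLI (older releases kept this under gcloud beta functions):

gcloud functions logs read logNewJSONFiles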

UPDATE (2019): When Cloud Functions first launched, it had some issues with ECONNRESET. I believe those are fixed now; if not, use something like npm:promise-retry.
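
As a rough sketch (not from the original answer; the helper name is made up and promise-retry's default backoff is assumed to be acceptable), the download step above could be wrapped like this:

const promiseRetry = require('promise-retry');

// Hypothetical helper: retries the download a few times before giving up,
// to ride out transient errors such as ECONNRESET.
function downloadWithRetry(bucketName, fileName){
    return promiseRetry(function(retry, attempt){
        console.log("download attempt " + attempt);
        return storage
            .bucket(bucketName)
            .file(fileName)
            .download()
            .catch(retry);
    }, { retries: 3 });
}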

answered by Paul

npm install @google-cloud/storage --production

package.json:

{
  "main": "app.js",
  "dependencies": {
    "@google-cloud/storage": "^1.2.1"
  }
}

Make sure that npm ls shows no errors like npm ERR! missing:.

app.js:

...

  const storage = require("@google-cloud/storage")();
  storage
    .bucket("mybucket")
    .file("myfile.txt")
    .download(function(err, contents){
      if (err) return console.error(err);
      console.log(contents.toString());
    });
answered by Nakilon