
Promise Resolving before Google Cloud Bucket Upload

I am writing some code that loops over a CSV and creates a JSON file based on the CSV. Included in the JSON is an array named photos, which is to contain the returned urls for the images that are being uploaded to Google Cloud Storage within the function. However, having the promise wait for the uploads to finish has me stumped, since everything is running asynchronously, and finishes off the promise and the JSON compilation prior to finishing the bucket upload and returning the url. How can I make the promise resolve after the urls have been retrieved and added to currentJSON.photos?

const csv=require('csvtojson')
const fs = require('fs');
const {Storage} = require('@google-cloud/storage');
var serviceAccount = require("./my-firebase-storage-spot.json");
const testFolder = './Images/';
var csvFilePath = './Inventory.csv';

var dirArr = ['./Images/Subdirectory-A','./Images/Subdirectory-B','./Images/Subdirectory-C'];
var allData = [];

csv()
.fromFile(csvFilePath)
.subscribe((json)=>{
  return new Promise((resolve,reject)=>{
    for (var i in dirArr ) {
      if (json['Name'] == dirArr[i]) {

        var currentJSON = {
          "photos" : [],
        };         

        fs.readdir(testFolder+json['Name'], (err, files) => {
          files.forEach(file => {
            if (file.match(/.(jpg|jpeg|png|gif)$/i)){
              var imgName = testFolder + json['Name'] + '/' + file;
              bucket.upload(imgName, function (err, file) {
                if (err) throw new Error(err);
                //returned uploaded img address is found at file.metadata.mediaLink
                currentJSON.photos.push(file.metadata.mediaLink);
              });              
            }else {
              //do nothing
            }
          });
        });
        allData.push(currentJSON);
      }
    }

    resolve(); 
  })
},onError,onComplete);

function onError() {
  // console.log(err)
}
function onComplete() {
  console.log('finito');
}

I've tried moving the resolve() around, and also tried placing the uploader section into the onComplete() function (which created new promise-based issues).

maudulus asked Feb 27 '20 15:02



2 Answers

Indeed, your code is not awaiting the asynchronous invocation of the readdir callback function, nor of the bucket.upload callback function.
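To make the timing issue concrete, here is a minimal sketch. `fakeUpload` is a hypothetical stand-in for any callback-style API (it is not part of any library) and simply calls back on a later tick. The first function resolves before any callback has run; the second wraps each callback in its own promise and waits for all of them.

```javascript
// Hypothetical stand-in for a callback-style API such as a bucket upload:
// the callback is invoked on a later tick of the event loop.
function fakeUpload(name, callback) {
  setTimeout(() => callback(null, `https://example.invalid/${name}`), 10);
}

// Mirrors the structure in the question: resolve() runs synchronously,
// before any upload callback has pushed its URL.
function collectTooEarly(names) {
  return new Promise((resolve) => {
    const photos = [];
    names.forEach((name) => {
      fakeUpload(name, (err, url) => photos.push(url));
    });
    resolve(photos.length); // still 0 at this point
  });
}

// Wraps each callback in a promise and waits for all of them.
function collectAfterCallbacks(names) {
  const uploads = names.map(
    (name) =>
      new Promise((resolve, reject) => {
        fakeUpload(name, (err, url) => (err ? reject(err) : resolve(url)));
      })
  );
  return Promise.all(uploads);
}
```

Calling `collectTooEarly(['a.jpg', 'b.jpg'])` resolves with a count of 0, while `collectAfterCallbacks` resolves with both URLs in input order.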

Asynchronous coding becomes easier when you use the promise-version of these functions.

bucket.upload will return a promise when omitting the callback function, so that is easy.
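One detail worth checking against the library documentation: as far as I know, the promise returned by bucket.upload resolves to an array whose first element is the uploaded File, not to the File itself. The sketch below uses a hypothetical stub in place of a real bucket so it runs without credentials; only the shape of the resolved value is meant to mirror the library.

```javascript
// Hypothetical stub standing in for a Bucket from @google-cloud/storage.
// Assumption: the real promise-based upload resolves to [file, apiResponse].
const bucket = {
  upload: async (localPath) => [
    { metadata: { mediaLink: `https://storage.example.invalid/${localPath}` } }, // File-like object
    {}, // raw API response
  ],
};

// Destructure the first element of the resolved array to reach the File.
async function uploadAndGetLink(localPath) {
  const [file] = await bucket.upload(localPath);
  return file.metadata.mediaLink;
}
```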

For readdir to return a promise, you need to use the fs Promise API: then you can use the promise-based readdir method and use promises throughout your code.

So use fs = require('fs').promises instead of fs = require('fs')
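As a small, self-contained illustration of the promise-based readdir, the sketch below creates a throwaway directory so it runs anywhere. The helper names `listImages` and `demo` and the file names are made up for the example.

```javascript
const fs = require('fs').promises;
const os = require('os');
const path = require('path');

// Lists image files in a directory using the promise-based readdir.
async function listImages(dir) {
  const files = await fs.readdir(dir);
  return files.filter((file) => /\.(jpg|jpeg|png|gif)$/i.test(file)).sort();
}

// Creates a temporary directory with three empty files, then lists the images.
async function demo() {
  const dir = await fs.mkdtemp(path.join(os.tmpdir(), 'images-'));
  await Promise.all(
    ['a.jpg', 'b.PNG', 'notes.txt'].map((name) =>
      fs.writeFile(path.join(dir, name), '')
    )
  );
  return listImages(dir);
}
```

Here `demo()` resolves with the two image names and leaves out `notes.txt`.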

With that preparation, your code can be transformed into this:

const csv = require('csvtojson');
const fs = require('fs').promises; // promise-based fs API
const testFolder = './Images/';
var csvFilePath = './Inventory.csv';
var dirArr = ['./Images/Subdirectory-A','./Images/Subdirectory-B','./Images/Subdirectory-C'];
// `bucket` is assumed to be created elsewhere; the question's code does not show it either.

(async function () {
    let arr = await csv().fromFile(csvFilePath);
    arr = arr.filter(obj => dirArr.includes(obj.Name));
    let allData = await Promise.all(arr.map(async obj => {
        let files = await fs.readdir(testFolder + obj.Name);
        files = files.filter(file => file.match(/\.(jpg|jpeg|png|gif)$/i));
        let photos = await Promise.all(
            files.map(async file => {
                var imgName = testFolder + obj.Name + '/' + file;
                // The promise version of upload resolves to an array; the File is its first element
                let [uploaded] = await bucket.upload(imgName);
                return uploaded.metadata.mediaLink;
            })
        );
        return {photos};
    }));
    console.log('finito', allData);
})().catch(err => {  // <-- The above async function runs immediately and returns a promise
    console.log(err);
});

Some remarks:

  • There is a shortcoming in your regular expression. You intended to match a literal dot, but you did not escape it (fixed in above code).

  • allData will contain an array of { photos: [......] } objects, and I wonder why you would not want all photo elements to be part of one single array. However, I kept your logic, so the above will still produce them in these chunks. Possibly, you intended to have other properties (next to photos) as well, which would make it actually useful to have these separate objects.
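Both remarks can be checked with a few lines. The file name and URL strings below are made-up sample values: the first part shows how the unescaped dot accepts a name with no extension, and the second shows one way to flatten the per-row `photos` arrays into a single array, should that be what you want.

```javascript
const unescaped = /.(jpg|jpeg|png|gif)$/i; // "." matches any character
const escaped = /\.(jpg|jpeg|png|gif)$/i;  // "\." matches only a literal dot

const name = 'notajpg'; // hypothetical file name with no extension
const unescapedMatches = unescaped.test(name); // true: "a" followed by "jpg"
const escapedMatches = escaped.test(name);     // false: there is no literal dot

// Sample data in the same shape as allData.
const allData = [{ photos: ['u1', 'u2'] }, { photos: ['u3'] }];
// Collects every photo URL into one flat array.
const allPhotos = allData.flatMap((entry) => entry.photos);
```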

trincot answered Sep 22 '22 08:09


The problem is that your code is not waiting inside your forEach. I would highly recommend looking into streams and doing things in parallel as much as possible. One powerful library that does this job for you is etl.

You can read rows from the CSV and process them in parallel batches rather than one by one.

I have tried to explain the lines in the code below. Hopefully it makes sense.

const etl = require("etl");
const fs = require("fs");
const csvFilePath = `${__dirname }/Inventory.csv`;
const testFolder = "./Images/";

const dirArr = [
  "./Images/Subdirectory-A",
  "./Images/Subdirectory-B",
  "./Images/Subdirectory-C"
];

fs.createReadStream(csvFilePath)
  .pipe(etl.csv()) // parse the csv file
  .pipe(etl.collect(10)) // this could be any value depending on how many you want to do in parallel.
  .pipe(etl.map(async items => {
    return Promise.all(items.map(async item => { // Iterate through 10 items
      const finalResult = await Promise.all(dirArr.filter(i => i === item.Name).map(async () => { // filter the matching one and iterate
        const files = await fs.promises.readdir(testFolder + item.Name); // read all files
        const filteredFiles = files.filter(file => file.match(/\.(jpg|jpeg|png|gif)$/i)); // filter out only images
        const result = await Promise.all(filteredFiles.map(async file => { // map first, then await all uploads
          const imgName = `${testFolder}${item.Name}/${file}`;
          const [uploadedFile] = await bucket.upload(imgName); // upload image; the promise resolves to an array whose first element is the File
          return uploadedFile.metadata.mediaLink;
        }));
        return result; // This contains all the media link for matching files
      }));
      // eslint-disable-next-line no-console
      console.log(finalResult); // Return arrays of media links for files
      return finalResult;
    }));
  }))
  .promise()
  .then(() => console.log("finished"))
  .catch(err => console.error(err));
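For comparison, the batching idea (collect a fixed number of rows, process them concurrently, then move on) can be sketched without any library. The names `chunk` and `processInBatches` are hypothetical helpers written for this example, not part of etl, and the sketch makes no claim about how etl works internally.

```javascript
// Splits an array into consecutive groups of at most `size` items.
function chunk(items, size) {
  const groups = [];
  for (let i = 0; i < items.length; i += size) {
    groups.push(items.slice(i, i + size));
  }
  return groups;
}

// Runs `worker` on every item, one batch at a time; items within a batch
// run concurrently, and the next batch starts only when the current one is done.
async function processInBatches(items, size, worker) {
  const results = [];
  for (const batch of chunk(items, size)) {
    const batchResults = await Promise.all(batch.map(worker));
    results.push(...batchResults);
  }
  return results;
}
```

For example, `processInBatches(rows, 10, uploadRow)` would handle ten rows at a time, where `rows` and `uploadRow` are whatever data and async function you supply.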

Ashish Modi answered Sep 21 '22 08:09