
Multiple paginated GET API calls in parallel/async in Node

I am calling the Bitbucket API to get all the files in a repo. So far I can get the list of all the folders in the repo, make the first API call to every root folder in parallel, and get the first 1000 files for each folder.

The problem is that the Bitbucket API returns at most 1000 files per folder per request.

I need to append the query param &start=nextPageStart and make the call again, repeating for each API until nextPageStart is null and isLastPage is true. How can I achieve that with the code below?

I get nextPageStart from the first call to the API. See the API response below.

Below is the code that I have so far.

Any help or guidance is appreciated.

Response from the individual API that is called per folder:

{
    "values": [
        "/src/js/abc.js",
        "/src/js/efg.js",
        "/src/js/ffg.js",
        ...
    ],
    "size": 1000,
    "isLastPage": false,
    "start": 0,
    "limit": 1000,
    "nextPageStart": 1000
}

Function where I make the asynchronous calls to get the list of files:

export function getFilesList() {
  const foldersURL: any[] = [];
  getFoldersFromRepo().then((response) => {
    const values = response.values;
    values.forEach((value: any) => {
      // Creating API URL for each folder in the repo
      const URL = 'https://bitbucket.abc.com/stash/rest/api/latest/projects/'
                   + value.project.key + '/repos/' + value.slug + '/files?limit=1000';
      foldersURL.push(URL);
    });
    return foldersURL;
  }).then((res) => {
    // console.log('Calling all the URLS in parallel');
    async.map(res, (link, callback) => {
      const options = {
        url: link,
        auth: {
          password: 'password',
          username: 'username',
        },
      };
      request(options, (error, response, body) => {

        // TODO: How do I make the get call again so that I can paginate and append the response to the body till the last page.

        callback(error, body);
      });
    }, (err, results) => {
      console.log('In err, results function');
      if (err) {
        return console.log(err);
      }
      // Consolidated results after all API calls.
      console.log('results', results);
    });
  })
  .catch((error) => error);
}
Grinish Nepal asked Oct 16 '18 19:10


2 Answers

I was able to get it working by creating a recursive function that takes a callback.

export function getFilesList() {
  const foldersURL: any[] = [];
  getFoldersFromRepo().then((response) => {
    const values = response.values;
    values.forEach((value: any) => {
      // Creating API URL for each folder in the repo
      const URL = 'https://bitbucket.abc.com/stash/rest/api/latest/projects/'
                   + value.project.key + '/repos/' + value.slug + '/files?limit=1000';
      foldersURL.push(URL);
    });
    return foldersURL;
  }).then((res) => {
    // console.log('Calling all the URLS in parallel');
    async.map(res, (link, callback) => {
      const options = {
        url: link,
        auth: {
          password: 'password',
          username: 'username',
        },
      };
      const myarray = [];
      // This function will consolidate the response till the last page per API.
      consolidatePaginatedResponse(options, link, myarray, callback);
    }, (err, results) => {
      console.log('In err, results function');
      if (err) {
        return console.log(err);
      }
      // Consolidated results after all API calls.
      console.log('results', results);
    });
  })
  .catch((error) => error);
}

function consolidatePaginatedResponse(options, link, myarray, callback) {
  request(options, (error, response, body) => {
    // Stop on a request error: body cannot be parsed in that case.
    if (error) {
      return callback(error);
    }
    const content = JSON.parse(body);
    content.link = options.url;
    myarray.push(content);
    if (content.isLastPage === false) {
      // Not the last page yet: request the next one and keep accumulating.
      options.url = link + '&start=' + content.nextPageStart;
      consolidatePaginatedResponse(options, link, myarray, callback);
    } else {
      // Final response after consolidation per API
      callback(null, JSON.stringify(myarray));
    }
  });
}
Grinish Nepal answered Oct 18 '22 20:10


I think the best way is to wrap it in an old-school for loop (forEach doesn't work with async: it is synchronous and does not wait for async callbacks, so all the requests are spawned at the same time).
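To illustrate the forEach point, here is a minimal sketch. fakeRequest is a hypothetical stand-in for a real HTTP call (it only resolves after a short timer), and the two wrapper functions are made up for the demonstration. It records the order in which calls start and end.

```typescript
// Hypothetical stand-in for an HTTP call: resolves after a short delay.
function fakeRequest(id: number, log: string[]): Promise<void> {
  log.push(`start ${id}`);
  return new Promise((resolve) =>
    setTimeout(() => {
      log.push(`end ${id}`);
      resolve();
    }, 10)
  );
}

// forEach ignores the promise returned by the async callback,
// so every request is started before any of them finishes.
async function withForEach(ids: number[]): Promise<string[]> {
  const log: string[] = [];
  ids.forEach(async (id) => {
    await fakeRequest(id, log);
  });
  // Give the pending timers time to fire so the log is complete.
  await new Promise((resolve) => setTimeout(resolve, 50));
  return log;
}

// A plain loop with await starts each request only after the previous one ends.
async function withForLoop(ids: number[]): Promise<string[]> {
  const log: string[] = [];
  for (const id of ids) {
    await fakeRequest(id, log);
  }
  return log;
}
```

With forEach the log shows both starts before either end; with the loop, each call ends before the next one starts.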

What I understood is that you run some sort of bootstrap query to get the values array, and then you iterate over the pages. Here is some code. I didn't fully grasp the APIs, so I'll give a simplified (and hopefully readable) answer that you should be able to adapt:

export async function getFilesList() {

    logger.info(`Fetching all the available values ...`);

    // values is unused in this simplified sketch; use it to build your page requests
    const values = await getFoldersFromRepo();

    logger.info("... Folders values fetched.");

    for (let i = 0; ; i++) {

        logger.info(`Working on page ${i}`);

        try {
            // await unwraps the promise, so pageResult is the resolved value, not the promise
            const pageResult: PageResult = await yourPagePromise(i);
            if (pageResult.isLastPage) {
                break;
            }
        } catch (err) {
            console.error(`Error on page ${i}`, err);
            break;
        }

    }

    logger.info("Done.");

    logger.info(`All finished!`);

}

The logic is that getFoldersFromRepo() first returns a promise that resolves to the values, and then I iterate sequentially over all available pages through the yourPagePromise function (which returns a promise). The async/await construct lets you write more readable code than a waterfall of then() calls.

I'm not sure it matches your API's specs, but it's logic you can use as a foundation! ^^
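For the case in the question, where each folder has to be paged sequentially while the folders run in parallel, the same idea could be sketched like this. The Page shape follows the response in the question. FetchPage, fetchAllPages, and fetchAllFolders are hypothetical names, and the sketch assumes the base URL already carries a query string (such as ?limit=1000) and that the API accepts start=0 for the first page.

```typescript
// Shape of one page, following the response shown in the question.
interface Page {
  values: string[];
  isLastPage: boolean;
  nextPageStart: number | null;
}

// Hypothetical signature: fetches and parses one page from the given URL.
type FetchPage = (url: string) => Promise<Page>;

// Follows nextPageStart until isLastPage is true, collecting every value.
async function fetchAllPages(baseUrl: string, fetchPage: FetchPage): Promise<string[]> {
  const files: string[] = [];
  let start: number | null = 0;
  while (start !== null) {
    // Assumes baseUrl already ends with a query string such as ?limit=1000.
    const page: Page = await fetchPage(`${baseUrl}&start=${start}`);
    files.push(...page.values);
    start = page.isLastPage ? null : page.nextPageStart;
  }
  return files;
}

// Pages within a folder are fetched one after another;
// the folders themselves are processed in parallel.
async function fetchAllFolders(urls: string[], fetchPage: FetchPage): Promise<string[][]> {
  return Promise.all(urls.map((url) => fetchAllPages(url, fetchPage)));
}
```

Passing the page fetcher in as a parameter keeps the pagination logic independent of the HTTP client, so it can wrap request, axios, or fetch, with authentication handled inside that function.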

pierpytom answered Oct 18 '22 22:10