 

Promise.all() vs await

I'm trying to understand node.js's single-threaded architecture and the event loop to make our application more efficient. So consider this scenario where I have to make several database calls for an HTTP API call. I can do it using Promise.all() or using separate awaits.

example:

Using async/await

await insertToTable1();
await insertToTable2();
await updateTable3();

Using Promise.all() I can do the same by

await Promise.all([insertToTable1(), insertToTable2(), updateTable3()]);

Here, for one API hit at a given time, Promise.all() will be quicker to return the response, as it fires the DB calls in parallel. But if I have 1000 API hits per second, will there be any difference? For this scenario, is Promise.all() better for the event loop?

Update: By 1000 API hits, I meant the overall traffic to the application. Consider there are 20-25 APIs. Out of these, a few might do DB operations, a few might make HTTP calls, etc. Also, at no point will we reach the DB pool's max connections.

Thanks in advance!!

asked Jul 12 '20 by Praveen Kumar


2 Answers

As usual when it comes to system design, the answer is: it depends.

There are a lot of factors that determine the performance of either. In general, awaiting a single Promise.all() waits for all requests in parallel.

Event Loop

The event loop uses exactly 0% CPU time to wait for a request. See my answer to this related question for an explanation of how exactly the event loop works: Performance of NodeJS with large amount of callbacks

So from the event loop's point of view there is no real difference between requesting sequentially and requesting in parallel with a Promise.all(). So if this is the core of your question, I guess the answer is there is no difference between the two.

However, processing the callbacks does take CPU time. Then again, the time it takes to finish executing all the callbacks is the same either way. So from the point of view of CPU performance, again, there is no difference between the two.

Making requests in parallel does reduce overall execution time, however. Firstly, if the service is multithreaded, you are essentially exploiting its multithreadedness by making parallel requests. This is what makes node.js fast even though it's single-threaded.

Even if the service you are requesting from isn't multithreaded and actually handles requests sequentially, or if the server you're requesting from has a single-core CPU (rare these days, but you can still rent single-core virtual machines), parallel requests reduce networking overhead, since your OS can send multiple requests in a single Ethernet frame, amortizing the overhead of packet headers over several requests. This has diminishing returns beyond around half a dozen parallel requests, however.
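
To make the difference concrete, here is a minimal sketch that simulates three independent 100 ms I/O calls with setTimeout (the delay and function names are made up for illustration):

// Simulate an async I/O call that takes ~100 ms.
const fakeDbCall = (name) =>
    new Promise((resolve) => setTimeout(() => resolve(name), 100));

async function sequential() {
    console.time('sequential');
    await fakeDbCall('insert1'); // ~100 ms
    await fakeDbCall('insert2'); // ~100 ms more
    await fakeDbCall('update3'); // ~100 ms more
    console.timeEnd('sequential'); // ~300 ms total
}

async function parallel() {
    console.time('parallel');
    // All three timers start immediately; we only wait for the slowest one.
    await Promise.all([fakeDbCall('insert1'), fakeDbCall('insert2'), fakeDbCall('update3')]);
    console.timeEnd('parallel'); // ~100 ms total
}

sequential().then(parallel);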

One Thousand Requests

You've hypothesized making 1000 requests. Whether or not awaiting 1000 promises in parallel actually causes parallel requests depends on how the API works at the network level.

Connection pools.

Lots of database libraries implement connection pools. That is, the library will open some number of connections to the database, for example 5, and reuse the connections.

In some implementations, making 1000 requests via such a library will cause the low-level networking code of the library to batch them 5 requests at a time. This means that at most you can have 5 parallel requests (assuming a pool of 5 connections). In this case it is perfectly safe to make 1000 parallel requests.
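
For example, with a pooled driver such as mysql2 (an assumption here; your library may differ, and the connection details below are placeholders), 1000 queries fired "in parallel" are transparently queued and multiplexed over the pool's 5 connections:

const mysql = require('mysql2/promise');

// A pool of 5 connections; host/user/database values are placeholders.
const pool = mysql.createPool({
    host: 'localhost',
    user: 'app',
    database: 'test',
    connectionLimit: 5,
});

async function main() {
    const queries = [];
    for (let i = 0; i < 1000; i++) {
        // The library queues these internally and runs at most 5 at a time,
        // one per pooled connection.
        queries.push(pool.query('SELECT ?', [i]));
    }
    await Promise.all(queries);
    await pool.end();
}

main();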

Some implementations, however, have a growable connection pool. In such implementations, making 1000 parallel requests will cause your software to open 1000 sockets to access the remote resource. In such cases, how safe it is to make 1000 parallel requests depends on whether the remote server allows this.

Connection limit.

Most databases, such as MySQL and PostgreSQL, allow the admin to configure a connection limit, for example 5, such that the database will reject connections beyond that limit. If you use a library that does not automatically manage the number of connections to your database, then your database will accept the first 5 requests and reject the remaining ones until another slot is available (it's possible that a connection is freed before node.js finishes opening the 1000th socket). In this case you cannot successfully make 1000 parallel requests - you need to manage how many parallel requests you make.
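
When that happens, the rejection surfaces as an error your code must handle. A hedged sketch of retrying with backoff (assuming the mysql2 driver, which reports MySQL's "too many connections" error 1040 under the code ER_CON_COUNT_ERROR - check your own driver's docs):

const mysql = require('mysql2/promise');

async function connectWithRetry(config, retries = 3) {
    for (let attempt = 1; attempt <= retries; attempt++) {
        try {
            return await mysql.createConnection(config);
        } catch (err) {
            // 'ER_CON_COUNT_ERROR' is MySQL's "too many connections" (error 1040).
            if (err.code === 'ER_CON_COUNT_ERROR' && attempt < retries) {
                // Back off and wait for a connection slot to free up.
                await new Promise((r) => setTimeout(r, 100 * attempt));
                continue;
            }
            throw err;
        }
    }
}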

Some API services also limit the number of requests you can make in parallel. Google Maps, for example, limits you to 500 requests per second. Therefore awaiting 1000 parallel requests will cause 50% of your requests to fail and possibly get your API key or IP address banned.

Networking limits.

There is a theoretical limit on the number of sockets your machine or a server can open. However this number is extremely high so it's not worth discussing here.

However, all OSes currently in existence limit the maximum number of open sockets. On Linux (eg Ubuntu & Android) and Unix (eg MacOSX and iOS), sockets are implemented as file descriptors, and there is a maximum number of file descriptors allocated per process.

For Linux this number usually defaults to 1024. Note that a process opens 3 file descriptors by default: stdin, stdout and stderr. That leaves 1021 file descriptors shared by files and sockets. So your 1000 parallel requests skirt very close to this number and may fail if, for example, two clients hit your API at the same time and each triggers 1000 parallel requests.

This number can be increased, but it does have a hard limit. The current maximum number of file descriptors you can configure on Linux is 590432. However, this extreme configuration only works properly on a single-user system with no daemons (or other background programs) running.

What to do?

The first rule when writing networking code is try not to break the network. Be reasonable in the number of requests you make at any one time. You can batch your requests to the limit of what the service expects.

With async/await it's easy. You can do something like this:

let parallel_requests = 10; // tune this to what the service can handle

while (one_thousand_requests.length > 0) {
    let batch = [];

    // Start up to `parallel_requests` requests for this batch.
    for (let i = 0; i < parallel_requests; i++) {
        let req = one_thousand_requests.pop();
        if (req) {
            batch.push(req()); // calling req() is what starts the request
        }
    }

    // Wait for the whole batch to finish before starting the next one.
    await Promise.all(batch);
}
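
Note that one_thousand_requests in this sketch must hold functions that start the request when called (thunks), not promises that are already running. Something like the following, where ids and fetchRecord are hypothetical stand-ins for your own data and request function:

// Build the work list as thunks so nothing starts until the batch loop calls req().
const one_thousand_requests = ids.map((id) => () => fetchRecord(id));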

Generally, the more requests you can make in parallel, the better (shorter) the overall processing time will be. I guess this is what you wanted to hear. But you need to balance parallelism against the factors above. 5 is generally OK. 10, maybe. 100 will depend on the server responding to the requests. At 1000 or more, the admin who installed the server will probably have to tune their OS.

answered Oct 01 '22 by slebetman


The await approach suspends the function's execution at every await call and executes them sequentially, while Promise.all can run them in parallel (concurrently) and resolves when all of them are successful.

So it's better to use Promise.all if your three methods (insertToTable1(), insertToTable2(), updateTable3()) are independent.
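
A quick illustration of that behavior (the helper names below are placeholders): Promise.all resolves to an array of results in argument order once every promise succeeds, and rejects as soon as any one of them fails:

const ok = (v, ms) => new Promise((res) => setTimeout(() => res(v), ms));
const fail = (ms) => new Promise((_, rej) => setTimeout(() => rej(new Error('boom')), ms));

// Resolves after ~200 ms with ['a', 'b'] - results keep argument order.
Promise.all([ok('a', 200), ok('b', 100)]).then(console.log);

// Rejects after ~50 ms with Error('boom'), even though ok('a', 200) would still succeed.
Promise.all([ok('a', 200), fail(50)]).catch((err) => console.error(err.message));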

JavaScript's ability to execute other work while heavy operations are in progress is achieved through the event loop and the call stack.

Event Loops

The decoupling of the caller from the response allows the JavaScript runtime to do other things while waiting for your asynchronous operation to complete and its callback to fire.

JavaScript runtimes contain a message queue which stores a list of messages to be processed and their associated callback functions. These messages are queued in response to external events (such as a mouse being clicked or receiving the response to an HTTP request) given a callback function has been provided.

The Event Loop has one simple job — to monitor the Call Stack and the Callback Queue. If the Call Stack is empty, it will take the first event from the queue and will push it to the Call Stack, which effectively runs it.
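
A classic way to see this ordering in action (note that in modern runtimes, promise callbacks go on a separate microtask queue that drains before timer callbacks):

console.log('start'); // synchronous, runs first on the Call Stack

setTimeout(() => {
    console.log('timeout'); // queued callback, runs once the stack is empty
}, 0);

Promise.resolve().then(() => {
    console.log('promise'); // microtask, runs before the timer callback
});

console.log('end'); // still synchronous, so it runs second

// Output: start, end, promise, timeout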

answered Oct 01 '22 by vizsatiz