Delays in HTTP requests via Node.js compared to browser

Tags: javascript, http, node.js, request

I'm using Node.js to query some public APIs via HTTP requests, and I'm using the request module to do so. I'm measuring the response time within my application, and I see that my application returns the results from API queries about 2-3 times slower than "direct" requests via curl or in the browser. Also, I noticed that connections to HTTPS-enabled services usually take longer than plain HTTP ones, but this could be a coincidence.

I tried to optimize my request options, but to no avail. For example, I query

https://www.linkedin.com/countserv/count/share?url=http%3A%2F%2Fwww.google.com%2F&lang=en_US

I'm using request.defaults to set the overall defaults for all requests:

var baseRequest = request.defaults({
    pool: {maxSockets: Infinity},
    jar: true,
    json: true,
    timeout: 5000,
    gzip: true,
    headers: {
        'Content-Type': 'application/json'
    }
});

The actual requests are made via

...
var start = new Date().getTime();

var options = {
    url: 'https://www.linkedin.com/countserv/count/share?url=http%3A%2F%2Fwww.google.com%2F&lang=en_US',
    method: 'GET'
};

baseRequest(options, function(error, response, body) {

    if (error) {
        console.log(error);
    } else {
        console.log((new Date().getTime()-start) + ": " + response.statusCode);
    }

});

Does anybody see optimization potential? Am I doing something completely wrong? Thanks in advance for any advice!

467

asked Mar 06 '15 08:03

Tobi

1 Answer

There are several potential issues you'll need to address given what I understand from your architecture. In no particular order they are:

- Using request will always be slower than using http directly since, as the wise man once said, "abstraction costs". ;) In fact, to squeeze out every possible ounce of performance, I'd handle all HTTP requests using Node's net module directly. For HTTPS, it's not worth rewriting the https module. And for the record, an HTTPS request will always cost more than the equivalent HTTP one, due to both the need to handshake cryptographic keys and the encrypt/decrypt work on the payload.
- If your requirements include retrieving more than one resource from any single server, ensure that those requests are made in order with HTTP keep-alive enabled so you can benefit from the already open socket. The time it takes to handshake a new TCP socket is huge compared to making a request on an already open socket.
- Ensure that HTTP connection pooling is disabled, or at least is not capping your concurrent sockets (see Nodejs Max Socket Pooling Settings).
- Ensure that your operating system and shell are not limiting the number of available sockets. See How many socket connections possible? for hints.
- If you're using Linux, check Increasing the maximum number of tcp/ip connections in linux. I'd also strongly recommend fine-tuning the kernel socket buffers.
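
To make the keep-alive point concrete, here is a minimal sketch using only Node's core https module. The helper name timedGet and the maxSockets value are assumptions for illustration, not something from the question:

```javascript
const https = require('https');

// One shared agent for all requests: keepAlive leaves idle sockets open so
// later requests to the same host can skip the TCP and TLS handshakes.
// The maxSockets value is an illustrative assumption, not a recommendation.
const keepAliveAgent = new https.Agent({ keepAlive: true, maxSockets: 50 });

// Hypothetical helper: GET `url` through `agent` and call back with
// (error, elapsedMs, statusCode).
function timedGet(url, agent, callback) {
  const start = Date.now();
  const req = https.get(url, { agent }, (res) => {
    res.resume(); // drain the body so the socket can go back to the pool
    res.on('end', () => callback(null, Date.now() - start, res.statusCode));
  });
  req.on('error', (err) => callback(err));
  return req;
}
```

Calling timedGet(url, keepAliveAgent, cb) twice in sequence against the same host should show the second call finishing faster, since only the first pays for connection setup. If you'd rather stay with request, its documented forever: true option should give you the same keep-alive behaviour.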

I'll add more suggestions as they occur to me.

Update

More on the topic of multiple requests to the same endpoint:

If you need to retrieve a number of resources from the same endpoint, it would be useful to segment your requests to specific workers that maintain open connections to that endpoint. In that way, you can be assured that you can get the requested resource as quickly as possible without the overhead of the initial TCP handshake.

The TCP handshake is a three-step process.

Step one: client sends a SYN packet to the remote server. Step two: the remote server replies to the client with a SYN+ACK. Step three: the client replies to the remote server with an ACK.

Depending on the client's latency to the remote server, this can add up to (as William Proxmire once said) "real money", or in this case, delay.

From my desktop, the current latency (round-trip time measured by ping) for a 2K octet packet to www.google.com is anywhere between 37 and 227ms.

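So assuming that we can rely on a round-trip mean of 95ms (over a perfect connection), each one-way leg takes about 47ms, and the initial TCP handshake would be around 140ms end to end, or SYN(47ms) + SYN+ACK(47ms) + ACK(47ms). Since the client can send its request right after its own ACK, the delay it actually sees is one full round trip, so this is roughly a tenth of a second just to establish the initial connection.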

If the connection requires retransmission, it could take much longer.

And this is assuming you retrieve a single resource over a new TCP connection.
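
The arithmetic can be sketched as a small function. It assumes each one-way leg is half the round-trip time, which real network paths don't guarantee, and the tlsRoundTrips parameter is my own addition to show what HTTPS adds on top (a full TLS 1.2 handshake is commonly counted as two extra round trips, TLS 1.3 as one):

```javascript
// Rough connection-setup estimate, assuming each one-way leg takes half the
// round-trip time (RTT). Real paths are often asymmetric.
function handshakeEstimate(rttMs, tlsRoundTrips = 0) {
  const oneWayMs = rttMs / 2;
  return {
    oneWayMs,
    // SYN + SYN+ACK + ACK, counted leg by leg.
    threeLegsMs: 3 * oneWayMs,
    // The client may send data right after its ACK, so the wait before the
    // first request byte leaves is one RTT, plus any TLS round trips.
    beforeFirstRequestMs: rttMs * (1 + tlsRoundTrips),
  };
}

console.log(handshakeEstimate(95));    // plain HTTP
console.log(handshakeEstimate(95, 2)); // HTTPS, assuming 2 extra round trips
```

The numbers are only as good as the RTT you feed in, so measure against the actual API host rather than reusing my ping figures.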

To ameliorate this, I'd have your workers keep a pool of open connections to "known" destinations which they would then advertise back to the supervisor process, so it could direct requests to the least loaded worker with a "live" connection to the target server.
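
A minimal sketch of the worker side of that idea, using core agents only. The names agentFor and liveOrigins are hypothetical, and how a worker reports the list to its supervisor (process.send, a socket, and so on) is left out:

```javascript
const http = require('http');
const https = require('https');

// Hypothetical registry: one keep-alive agent per origin ("known" destination).
const agents = new Map();

// Returns the agent for the URL's origin, creating it on first use.
function agentFor(url) {
  const { protocol, origin } = new URL(url);
  if (!agents.has(origin)) {
    const Agent = protocol === 'https:' ? https.Agent : http.Agent;
    agents.set(origin, new Agent({ keepAlive: true }));
  }
  return agents.get(origin);
}

// Origins that currently have at least one idle, still-open socket.
// A worker could report this list to its supervisor process.
function liveOrigins() {
  const live = [];
  for (const [origin, agent] of agents) {
    const idle = Object.values(agent.freeSockets)
      .reduce((count, sockets) => count + sockets.length, 0);
    if (idle > 0) live.push(origin);
  }
  return live;
}
```

Pass agentFor(url) as the agent option of each outgoing request. Note that liveOrigins only sees idle sockets, so an origin whose sockets are all busy won't be listed even though its connections are open.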

answered Sep 22 '22 06:09

Rob Raisch
