 

connect EADDRNOTAVAIL in nodejs under high load - how to faster free or reuse TCP ports?

I have a small wiki-like web application based on the Express framework which uses Elasticsearch as its back-end. For each request it basically only goes to the Elasticsearch DB, retrieves the object and returns it rendered by the Handlebars template engine. The communication with Elasticsearch is over HTTP.

This works great as long as I have only one Node.js instance running. After I updated my code to use the cluster module (as described in the Node.js documentation), I started to encounter the following error: connect EADDRNOTAVAIL.

This error shows up when I have 3 or more Python scripts running which constantly retrieve URLs from my server. With 3 scripts I can retrieve ~45,000 pages; with 4 or more scripts running it is between 30,000 and 37,000 pages. Running only 1 or 2 scripts, I stopped them after half an hour, when they had retrieved 160,000 and 310,000 pages respectively.

I've found this similar question and tried changing http.globalAgent.maxSockets, but that didn't have any effect.

This is the part of the code which listens for the URLs and retrieves the data from Elasticsearch.

app.get('/wiki/:contentId', (req, res) ->
    http.get(elasticSearchUrl(req.params.contentId), (innerRes) ->
        if (innerRes.statusCode != 200)
            res.send(innerRes.statusCode)
            innerRes.resume()
        else
            body = ''
            innerRes.on('data', (bodyChunk) ->
                body += bodyChunk
            )
            innerRes.on('end', () ->
                res.render('page', {'title': req.params.contentId, 'content': JSON.parse(body)._source.html})
            )
    ).on('error', (e) ->
        console.log('Got error: ' + e.message)  # the error is reported here
    )
)

UPDATE:

After looking more into it, I now understand the root of the problem. I ran the command netstat -an | grep -e tcp -e udp | wc -l several times during my test runs to see how many ports were in use, as described in the post Linux: EADDRNOTAVAIL (Address not available) error. I could observe that at the time I received the EADDRNOTAVAIL error, 56,677 ports were in use (instead of ~180 normally).

Also, when using only 2 simultaneous scripts, the number of used ports saturates at around 40,000 (+/- 2,000), i.e. ~20,000 ports per script; that seems to be the point at which Node.js cleans up old ports before new ones are created. With 3 scripts running, the demand (~60,000) exceeds the 56,677 available ports. This explains why it fails with 3 scripts requesting data, but not with 2.
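The ceiling comes from the kernel's ephemeral port range rather than from Node itself. A diagnostic sketch, assuming a Linux host:

```shell
# Show the ephemeral port range the kernel hands out for outgoing
# connections (two numbers: lowest and highest usable port).
cat /proc/sys/net/ipv4/ip_local_port_range

# Count sockets currently in use, as in the question.
netstat -an | grep -e tcp -e udp | wc -l
```

Closed connections linger in TIME_WAIT for a while, so under sustained load the count of occupied ports can far exceed the number of simultaneously active connections.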

So now my question changes to: how can I force Node.js to free up ports quicker, or to reuse the same ports all the time (which would be the preferable solution)?

Thanks

asked Oct 21 '22 by peter

1 Answer

For now, my solution is setting the agent of my request options to false. According to the documentation, this

opts out of connection pooling with an Agent, defaults request to Connection: close.

As a result, my number of used ports doesn't exceed 26,000. This is still not a great solution, especially since I don't understand why the reuse of ports doesn't work, but it solves the problem for now.

answered Oct 27 '22 by peter