
node.js server leaking TCP connections?

Note: see my edit at the end of the post.

I have a node.js (Express) server handling approximately 15-30 requests/second. It serves a handful of simple JADE templates and a Durandal SPA application, with the bulk of the requests going to the simple JADE templates. Everything works fine for a couple of minutes, but after a while the server starts getting EMFILE errors and eventually crashes. After some troubleshooting, I found that the output of lsof -i -n -P | grep node eventually contains a large number of rows like these:

node    8800 my_user   13u  IPv4 906628      0t0  TCP 172.x.x.x:3000->x.x.x.x:44654 (ESTABLISHED)
node    8800 my_user   14u  IPv4 908407      0t0  TCP 172.x.x.x:3000->x.x.x.x:13432 (ESTABLISHED)
node    8800 my_user   15u  IPv4 908409      0t0  TCP 172.x.x.x:3000->x.x.x.x:38814 (ESTABLISHED)
node    8800 my_user   19u  IPv4 906622      0t0  TCP 172.x.x.x:3000->x.x.x.x:56743 (ESTABLISHED)
node    8800 my_user   20u  IPv4 907221      0t0  TCP 172.x.x.x:3000->x.x.x.x:46897 (ESTABLISHED)
...

I am a beginner with node.js, but it looks like it is unable to close connections that have already completed, which eventually leads to EMFILE errors and crashes.

I already tried the following:

  • ulimit -n 2048: this is obviously a temporary solution, it delays the EMFILE errors but doesn't solve the issue
  • lowering the response timeout (which is 2 minutes by default, if I recall correctly) to something closer to 5-10 seconds

With both of those adjustments in place, the server takes MUCH longer to crash, but it still does so eventually. Even without any load, it seems unable to dispose of the "stuck" TCP ESTABLISHED connections, and when requests start arriving again, the number of open file descriptors keeps growing until the process crashes.
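One way to watch this from inside the process, instead of polling lsof, is to count the sockets the server currently holds. The following is only a minimal sketch in plain JavaScript: trackConnections is a hypothetical helper name that does not appear in the original setup, and it assumes the server object emits the standard 'connection' event (both http and https servers do, for the underlying TCP socket).

```javascript
'use strict';

// Hypothetical helper (not part of the original post): keeps a live Set of
// the sockets that a server currently holds open.
function trackConnections(server) {
  const open = new Set();
  server.on('connection', (socket) => {
    open.add(socket);
    // 'close' is emitted once the socket is fully closed, which is the
    // point at which its file descriptor is released.
    socket.once('close', () => open.delete(socket));
  });
  return open;
}
```

Calling trackConnections(server) right after the server is created and logging the size of the returned set every few seconds should show whether the count keeps climbing while no requests are arriving.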

My node.js server (in coffeescript) looks like this (I'm using mimosa to start up the server, but I don't think it should make any difference):

express = require 'express'
engines = require 'consolidate'

fs      = require 'fs'
http    = require 'http'
https   = require 'https'

options =
    ca: fs.readFileSync __dirname + '/ssl/ca.pem'
    key: fs.readFileSync __dirname + '/ssl/key.pem'
    cert: fs.readFileSync __dirname + '/ssl/cert.pem'

exports.startServer = (config, callback) ->

    app = express()

    app.configure ->
        app.set 'port', config.server.port
        app.set 'views', config.server.views.path
        app.engine config.server.views.extension, engines[config.server.views.compileWith]
        app.set 'view engine', config.server.views.extension
        app.use express.logger({ format: ":date :method :remote-addr :url :response-time" })
        app.use express.favicon __dirname + '/public/favicon.ico'
        app.use express.bodyParser()
        app.use express.methodOverride()
        app.use express.compress()
        app.use express.static(config.watch.compiledDir)
        app.use config.server.base, app.router

    app.configure 'development', ->
        app.use express.errorHandler()

    app.get '/my/route/n1', (req, res) ->
        res.render "./my/template/n1"
    app.get '/my/route/n2', (req, res) -> # route getting the bulk of requests
        res.setTimeout(10000) # timeout introduced in an attempt to fix the problem
        res.render "./my/template/n2"
    app.get '/my/route/n3', (req, res) ->
        res.render "./my/template/n3"
    app.get '*/?', (req, res) -> res.render 'index'

    server = https.createServer options, app
    server.listen config.server.port, ->
        console.log "Express server listening on port %d in %s mode", server.address().port, app.settings.env

    callback server

I think node.js shouldn't have any problem serving this number of requests, so I suspect a misconfiguration on my part or something along those lines. What am I doing wrong? Thank you!

P.S.: I edited out a bunch of stuff from the code/output of lsof both for privacy concerns and also because it should be irrelevant for the issue; however, if any other info is needed, I'll update the question to provide it as soon as possible.

EDIT: I think I found the source of my issue. The connection that Express uses to serve ./my/template/n2 is indeed timing out after 10 seconds, but the connections used by express.static to serve images, CSS and other static resources are not (well, they are, but they take 2-5 minutes to release their file descriptors...). I guess my question then reduces to: how do I set the response timeout for the files served by express.static? I tried using app.use express.timeout(10000) before every other middleware, but it only seems to work for the main JADE file and not for images or CSS.

I am using Express 3. Thank you again in advance.

pmarchezz asked Oct 21 '22 06:10


1 Answer

The issue appears to be solved after adding this middleware before every other app.use call:

        app.use (req, res, next) ->
            res.setTimeout(10000)
            next()

I doubt it's the most elegant way to solve the issue but it's working fine right now.
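A possible alternative, which I have not tested against the setup above, is to set the idle timeout once on the server object rather than per response. Node's http.Server (and https.Server) has a setTimeout method that applies to every incoming socket, so it should also cover the connections used for static files. The sketch below is plain JavaScript and uses http so that it runs without certificates; createServerWithIdleTimeout is a hypothetical helper name, and the 10000 ms value is simply the one used above.

```javascript
'use strict';
const http = require('http');

// Hypothetical helper (not from the original post): builds a server whose
// sockets are timed out after idleMs milliseconds of inactivity.
function createServerWithIdleTimeout(handler, idleMs) {
  const server = http.createServer(handler);
  // When no 'timeout' listener is registered on the server, Node destroys
  // sockets that hit this timeout, which releases their file descriptors.
  server.setTimeout(idleMs);
  return server;
}

// Same 10-second value as the middleware above; the server is not started here.
const server = createServerWithIdleTimeout((req, res) => res.end('ok'), 10000);
```

In the original code the equivalent would presumably be a single server.setTimeout 10000 call right after https.createServer options, app, which would keep the timeout out of the middleware chain.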

pmarchezz answered Oct 23 '22 02:10