
Node.js requests randomly begin to hang and won't clear until server restart

I've been running into an odd and seemingly random issue on our web app that I haven't been able to debug. It runs fine for anywhere from 10 minutes to 6 hours, and then all of a sudden no remote requests to or from the server can be made; they just hang (this includes regular HTTP and WebSocket requests). The odd thing is that browsing to the site normally still works until the OS file descriptor limit is reached, at which point HTTP crashes completely under all of the stalled connections.

No other errors are reported, but the following error is thrown when the issue begins (I assume this is a side effect of whatever is going on rather than the cause).

TypeError: Cannot read property '0' of null
    at null.<anonymous> (/app/node_modules/mongojs/node_modules/mongodb/lib/mongodb/collection.js:504:22)
    at args.(anonymous function) (/app/node_modules/strong-agent/lib/proxy.js:85:18)
    at g (events.js:175:14)
    at EventEmitter.emit (events.js:98:17)
    at Base.__executeAllServerSpecificErrorCallbacks (/app/node_modules/mongojs/node_modules/mongodb/lib/mongodb/connection/base.js:315:29)
    at /app/node_modules/mongojs/node_modules/mongodb/lib/mongodb/connection/repl_set/ha.js:273:22
    at /app/node_modules/mongojs/node_modules/mongodb/lib/mongodb/connection/repl_set/ha.js:370:11
    at /app/node_modules/mongojs/node_modules/mongodb/lib/mongodb/connection/repl_set/ha.js:352:28
    at _callback (/app/node_modules/mongojs/node_modules/mongodb/lib/mongodb/db.js:670:5)
    at /app/node_modules/mongojs/node_modules/mongodb/lib/mongodb/auth/mongodb_cr.js:47:13

I've tried raising the file descriptor limits and the global agent maxSockets, with no effect on this behavior. There's no influx of traffic when this happens, and it happens equally often during peak and off-peak times. CPU usage consistently stays below 5% and shows no perceptible change leading up to or during the crash. The server also never drops below 1GB of free memory.

The stack: SmartOS cloud server (Joyent), Express, Socket.io, MongoDB and Redis.

I've been debugging this for several days and have completely run out of ideas where to look. Hoping someone on SO has run into something similar or has different ideas of what can be tried or tested.

James Simpson asked Feb 15 '23 07:02


1 Answer

After countless hours of debugging and more debugging, I finally found the culprit. An error was being thrown inside of several different mongojs callbacks, which appears to have bubbled up and blocked the connections from closing. Over time, this got to a tipping point and connections started hanging until the file descriptor limit was reached.
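
To illustrate the failure mode, here is a minimal sketch of one way to keep an exception thrown inside a callback from escaping into the code that invoked it. This is not the fix that was applied to Now.js; `safeCallback` is a hypothetical helper written for this example, and the callback below is a stand-in for a database callback rather than real mongojs code.

```javascript
// Hypothetical helper (not part of mongojs or Now.js): wraps a Node-style
// callback so that an exception thrown inside it is caught and handed to
// onError instead of propagating into the caller.
function safeCallback(fn, onError) {
  return function () {
    try {
      return fn.apply(this, arguments);
    } catch (err) {
      onError(err);
    }
  };
}

// Usage with a stand-in for a database callback.
var caught = [];
var cb = safeCallback(function (err, docs) {
  // Same kind of failure as in the stack trace: indexing into null.
  return docs[0];
}, function (err) {
  caught.push(err);
});

// Simulate the driver invoking the callback with a null result.
cb(null, null);
console.log('caught errors:', caught.length);
```

Wrapping every callback this way is a blunt instrument; it is mainly useful for locating which callback is throwing, after which the underlying bug should be fixed directly.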

The error turned out to be in the Now.js node module (which has been abandoned). If anyone out there is running into this issue while using Now.js, I forked it and patched the bug. You can find the commit here: https://github.com/goldfire/now/commit/b5bd54f8950602f752a710c606be6754b759cab2.

The way I found this bug was to attach an error listener to the DB object:

var db = require('mongojs').connect('...', ['collection']);
db.client.on('error', function(err){
  console.log(err.stack);
});
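
The listener above relies on standard Node.js EventEmitter behavior, which can be seen in isolation with the sketch below. It uses a plain `EventEmitter` as a stand-in for `db.client` (an assumption made so the example runs without a database), and the error messages are made up for the demonstration.

```javascript
var EventEmitter = require('events').EventEmitter;

// Stand-in for db.client; no real mongojs connection is involved.
var client = new EventEmitter();
var logged = [];

client.on('error', function (err) {
  logged.push(err.message);
  console.log(err.stack);
});

// With a listener attached, emitting 'error' invokes the listener.
client.emit('error', new Error('simulated driver error'));

// Without any 'error' listener, emit('error') throws the error instead.
var bare = new EventEmitter();
var threw = false;
try {
  bare.emit('error', new Error('unhandled'));
} catch (e) {
  threw = true;
}
console.log('emit without a listener threw:', threw);
```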
James Simpson answered Feb 26 '23 22:02