
I'm receiving duplicate messages in my clustered node.js/socket.io/redis pub/sub application

I'm using Node.js, Socket.io with RedisStore, the Cluster module from the Socket.io guys, and Redis.

I have a pub/sub application that works well on just one Node.js process. But when it comes under heavy load it maxes out just one core of the server, since a single Node.js process can't use multiple cores on its own.

As you can see below, I'm now using the Cluster module from Learnboost, the same people who make Socket.io.

But when I fire up 4 worker processes, each browser client that comes in and subscribes gets 4 copies of each message that is published in Redis. If there are three worker processes, there are three copies.

I'm guessing I need to move the Redis pub/sub functionality into the cluster.js file somehow.

Cluster.js

var cluster = require('./node_modules/cluster');

cluster('./app')
  .set('workers', 4)
  .use(cluster.logger('logs'))
  .use(cluster.stats())
  .use(cluster.pidfiles('pids'))
  .use(cluster.cli())
  .use(cluster.repl(8888))
  .listen(8000);

App.js

var redis = require('redis')
  , sys = require('sys');

var rc = redis.createClient();

var path = require('path')
  , connect = require('connect')
  , app = connect.createServer(connect.static(path.join(__dirname, '../')));

// require the new redis store
var sio = require('socket.io')
  , RedisStore = sio.RedisStore
  , io = sio.listen(app);

io.set('store', new RedisStore());

io.sockets.on('connection', function(socket) {
    sys.log('ShowControl -- Socket connected: ' + socket.id);

    socket.on('channel', function(ch) {
        socket.join(ch);
        sys.log('ShowControl -- ' + socket.id + ' joined channel: ' + ch);
    });

    socket.on('disconnect', function() {
        console.log('ShowControl -- Socket disconnected: ' + socket.id);
    });
});

rc.psubscribe('showcontrol_*');

rc.on('pmessage', function(pat, ch, msg) {
    io.sockets.in(ch).emit('show_event', msg);
    sys.log('ShowControl -- Publish sent to channel: ' + ch);
});

// cluster compatibility
if (!module.parent) {
  app.listen(process.argv[2] || 8081);
  console.log('Listening on ', app.address());
} else {
  module.exports = app;
}

client.html

<script src="http://localhost:8000/socket.io/socket.io.js"></script>
<script src="https://ajax.googleapis.com/ajax/libs/jquery/1.7.0/jquery.min.js"></script>
<script>
    var socket = io.connect('localhost:8000');
    socket.emit('channel', 'showcontrol_106');
    socket.on('show_event', function (msg) {
        console.log(msg);
        $("body").append('<br/>' + msg);
    });
</script>
asked Dec 02 '11 by Nick Messick

2 Answers

I've been battling with cluster and socket.io. Every time I use clustering (I use the built-in Node.js cluster module, though) I get a lot of performance problems and issues with socket.io.

While trying to research this, I've been digging through the bug reports and similar threads on the socket.io GitHub repo, and anyone putting clusters or external load balancers in front of their servers seems to have problems with socket.io.

It seems to produce the error "client not handshaken client should reconnect", which you will see if you increase the verbose logging. This appears a lot whenever socket.io runs in a cluster, so I think it comes back to this: the client gets connected to a random instance in the socket.io cluster every time it makes a new connection (it makes several HTTP/socket/flash connections when authorizing, and more later on when polling for new data).

For now I've reverted to running only one socket.io process at a time. This might be a bug, but it could also be a shortcoming of how socket.io is built.

Added: My way of solving this in the future will be to assign a unique port to each socket.io instance inside the cluster and then cache the port selection on the client side.
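A rough sketch of that port-caching idea (all names hypothetical; real code would persist the choice in a cookie or localStorage): map the client's session id deterministically to one port, so every reconnect lands on the same socket.io instance.

```javascript
// Ports the individual socket.io instances listen on (assumed layout).
var PORTS = [8001, 8002, 8003, 8004];

// Deterministically map a client/session id to one port, so the same
// client always reconnects to the same socket.io instance.
function portFor(sessionId) {
  var hash = 0;
  for (var i = 0; i < sessionId.length; i++) {
    hash = (hash * 31 + sessionId.charCodeAt(i)) % 0x7fffffff;
  }
  return PORTS[hash % PORTS.length];
}

// On the client you would then connect with something like:
//   var socket = io.connect('http://example.com:' + portFor(mySessionId));
console.log(portFor('abc123') === portFor('abc123')); // stable: true
```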

answered Sep 28 '22 by zpr

Turns out this isn't a problem with Node.js/Socket.io, I was just going about it the completely wrong way.

Not only was I publishing into the Redis server from outside the Node/Socket stack, I was still directly subscribed to the Redis channel. On both ends of the pub/sub situation I was bypassing the "Socket.io cluster with Redis Store on the back end" goodness.

So, I created a little app (with Node.js/Socket.io/Express) that took messages from my Rails app and 'announced' them into a Socket.io room using the socket.io-announce module. Now, thanks to Socket.io's routing, each node worker only delivers messages to the browsers connected to it directly. In other words, no more duplicate messages, since both the pub and the sub happen within the Node.js/Socket.io stack.

After I get my code cleaned up I'll put an example up on GitHub somewhere.

answered Sep 28 '22 by Nick Messick