 

Is a node.js app that both serves a REST API and handles websockets a good idea?

Disclaimer: I'm new to node.js so I am sorry if this is a weird question :)

I have a node.js app using express.js to serve a REST API. The data served by the REST API is fetched from a NoSQL database by the node.js app. All clients only use HTTP GET. There is one exception though: data is PUT and DELETEd from the master database (a relational database on another server). The thought behind this setup is of course to let the 'node.js/NoSQL database' server(s) be a public front end and thereby protect the master database from heavy traffic.

Potentially a number of different client applications will use the REST API, but mainly it will be used by a client app with a long lifetime (typically 0.5 to 2 hours). Instead of letting this app constantly poll the REST API for possible new data, I want to use websockets so that data is only sent to a client when there is new data. I will use a node.js app for this, probably with socket.io, so that it can fall back to API polling if websockets are not supported by the client. New data should be sent to clients each time the master database PUTs or DELETEs objects in the NoSQL database.

The question is whether I should use one node.js app for both the API and the websockets, or one app for the API and one for the websockets.

Things to consider:

  • Performance: The app(s) will be hosted on a cluster of servers with a load balancer and an HTTP accelerator in front. Would one app handling everything perform better than two apps with distinct tasks?
  • Traffic between apps: If I choose a two-app solution, the API app that receives PUTs and DELETEs from the master database will have to notify the websocket app every time it receives new data (or the master database will have to notify both apps). Could the doubled traffic be a performance issue?
  • Code cleanliness: I believe two apps will result in cleaner and better code, but then again there will surely be some code common to both apps, which will lead to having two copies of it.

As to how heavy the load can be, it is very difficult to say, but a possible peak can involve:

  • 50000 clients, each listening to up to 5 different channels
  • new data being sent from the master every 5th second
  • new data being sent to approximately 25% of the clients (some data should be sent to all clients, and other data probably to below 1% of the clients)
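To get a feel for that fan-out, here is a minimal sketch (all names are hypothetical, not from any framework) of an in-memory channel registry: it delivers a message only to the clients subscribed to the affected channel, which is the pattern the websocket app would implement on top of socket.io rooms.

```javascript
// Sketch: per-channel subscriptions, so a change from the master database
// is pushed only to the clients listening on that channel.
class ChannelRegistry {
  constructor() {
    this.channels = new Map(); // channel name -> Map of clientId -> send callback
  }

  subscribe(channel, clientId, send) {
    if (!this.channels.has(channel)) this.channels.set(channel, new Map());
    this.channels.get(channel).set(clientId, send);
  }

  unsubscribe(channel, clientId) {
    const subs = this.channels.get(channel);
    if (subs) subs.delete(clientId);
  }

  // Called when the master database PUTs/DELETEs an object: push the
  // change to every client subscribed to the affected channel.
  publish(channel, payload) {
    const subs = this.channels.get(channel) || new Map();
    for (const send of subs.values()) send(payload);
    return subs.size; // number of clients notified
  }
}
```

With 50000 clients and ~25% of them matching a typical message, each publish is ~12500 callback invocations, which is the number that drives how many websocket-app instances you need.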

UPDATE: Thanks for the answers, guys. More food for thought here. I have decided to have two node.js apps, one for the REST API and one for the websockets. The reason is that I believe it will be easier to scale them. To begin with, the whole system will be hosted on three physical servers, and one node.js app for the REST API on each server should be sufficient, but the websocket app will probably need several instances on each physical server.

Dagligleder asked Sep 11 '15


3 Answers

This is a very good question.

If you are looking at a legacy system, and you already have a REST interface defined, there are not many advantages to adding WebSockets. Things that may point you to WebSockets would be:

  • a demand for server-to-client or client-to-client real-time data
  • a need to integrate with server components using a classic bi-directional protocol (e.g. you want to write an FTP or sendmail client in JavaScript).

If you are starting a new project, I would try to have a hard split in the project between:

  • the serving of static content (images, js, css) using HTTP (that was what it was designed for) and

  • the serving of dynamic content (real-time data) using WebSockets (load-balanced, subscription/messaging based, automatic reconnect enabled to handle network blips).

So, why should we try to have a hard separation? Let's consider the advantages of a HTTP-based REST protocol.

The use of the HTTP protocol for REST semantics has certain advantages:

  • Stateless interactions: none of the client's context is stored on the server side between requests.
  • Cacheable: clients can cache the responses.
  • Layered system: intermediaries (proxies, caches, load balancers) can be inserted without clients being aware of them.
  • Easy testing: it's easy to use curl to test an HTTP-based protocol.

On the other hand...

The use of a messaging protocol (e.g. AMQP, JMS/STOMP) on top of WebSockets does not preclude any of these advantages.

  • WebSockets can be transparently load-balanced, messages and state can be cached, efficient stateful or stateless interactions can be defined.

  • A basic reactive analysis style can define which events trigger which messages between the client and the server.

Key additional advantages are:

  • a WebSocket is intended to be a long-term persistent connection, usable for multiple different messaging purposes over a single connection

  • a WebSocket connection allows for full bi-directional communication, allowing data to be sent in either direction in sympathy with network characteristics.

  • one can use connection offloading to share subscriptions to common topics using intermediaries. This means with very few connections to a core message broker, you can serve millions of connected users efficiently at scale.

  • monitoring and testing can be implemented with an admin interface to send/receive messages (provided with all message brokers).

  • the cost of all this is that one needs to deal with re-establishment of state when the WebSocket needs to reconnect after being dropped. Many protocol designers build in the notion of a "sync" message to provide context from the server to the client.
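That "sync" idea can be sketched as sequence-numbered messages (a common pattern; the names below are illustrative, not a standard API): the server keeps a short replay buffer, and a reconnecting client reports the last sequence number it saw.

```javascript
// Sketch of the "sync" message pattern: each message gets a sequence number
// and the server keeps a bounded replay buffer, so a reconnecting client can
// receive only what it missed -- or be told to do a full refetch.
class SyncedChannel {
  constructor(bufferSize = 100) {
    this.seq = 0;
    this.buffer = []; // recent { seq, payload } entries
    this.bufferSize = bufferSize;
  }

  // Server side: assign a sequence number and remember the message.
  record(payload) {
    const msg = { seq: ++this.seq, payload };
    this.buffer.push(msg);
    if (this.buffer.length > this.bufferSize) this.buffer.shift();
    return msg;
  }

  // Handle a client's "sync" after reconnect: replay missed messages, or
  // signal that the gap exceeds the buffer and a full REST refetch is needed.
  sync(lastSeenSeq) {
    const missed = this.buffer.filter((m) => m.seq > lastSeenSeq);
    const gapTooLarge =
      this.buffer.length > 0 && lastSeenSeq < this.buffer[0].seq - 1;
    return gapTooLarge ? { resync: true } : { resync: false, missed };
  }
}
```

The `resync` branch is the escape hatch: when a client has been gone longer than the buffer covers, it falls back to the REST API for a fresh snapshot and then resumes the stream.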

Either way, your model object could be the same whether you use REST or WebSockets, but that might mean you are still thinking too much in terms of request-response rather than publish/subscribe.

nowucca answered Oct 12 '22


The first thing you must think about is how you're going to scale the servers and manage their state. With a REST API this is largely straightforward, as REST APIs are for the most part stateless, and every load balancer knows how to proxy HTTP requests. Hence, REST APIs can be scaled horizontally, leaving the few bits of state to the persistence layer (database). With websockets it's often a different matter. You need to research which load balancer you're going to use (in a cloud deployment, this often depends on the cloud provider), then figure out what kind of websocket support or configuration the load balancer will need.

Then, depending on your application, you need to figure out how to manage the state of your websocket connections across the cluster. Think about the different use cases: e.g. if a websocket event on one server alters the state of the data, will you need to propagate this change to a different user on a different connection? If the answer is yes, then you'll probably need something like Redis to manage your ws connections and communicate changes between the servers.

As for performance, at the end of the day it's still just HTTP connections, so I doubt there will be a big difference in separating the server functionality. However, I think two servers would go a long way in improving code cleanliness, as long as you have another 'core' module to isolate the code common to both servers.
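That 'core' module idea can be sketched like this (the contents are hypothetical examples, not from the question's codebase): anything both apps need, such as channel naming and serialization rules, lives in one module that the REST app and the websocket app both require, so the two-app split doesn't mean two copies of the shared logic.

```javascript
// core.js (sketch): code shared by the REST app and the websocket app.
const core = {
  // Both apps need to agree on which channel an object belongs to.
  channelFor(object) {
    return object.type + ':' + (object.region || 'all');
  },

  // Both apps serialize objects the same way before sending them out.
  serialize(object) {
    return JSON.stringify({
      id: object.id,
      type: object.type,
      data: object.data,
      updatedAt: object.updatedAt,
    });
  },
};

module.exports = core;
```

Each app then just does `const core = require('./core')`, and a change to the channel scheme or wire format lands in both apps at once.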

Yuri Zarubin answered Oct 12 '22


Personally I would do them together, because you can share the models and most of the code between the REST API and the WS.

At the end of the day, what Yuri said in his answer is correct, but it is not that much work to load-balance WS anyway; everyone does it nowadays. The approach I took is to have REST for everything and then create some WS "endpoints" for subscribing to real-time data sent server-to-client.

So from what I understood, your client would just get notifications from the server with updates, so I would definitely go with WS. You subscribe to some events and then you get new results when there are any. Constantly asking with HTTP calls is not the best way.

We had this need and basically built a small framework around this idea http://devalien.github.io/Axolot/

Basically you can see our approach in the controller below (this is just an example; in our real-world app we have subscriptions so we can notify when we have new data or when we finish a procedure). Under actions are the REST endpoints, and under sockets the websocket endpoints.

module.exports = {
    model: 'user', // We are attaching the user to the model, so CRUD operations are there (good for dev purposes)
    path: '/user', // This is the endpoint

    actions: {
        'get /': [
            function (req, res) {
                var query = {};

                Model.user.find(query).then(function (user) { // Find from the user model declared above
                    res.send(user);
                }).catch(function (err) {
                    res.send(400, err);
                });
            }],
    },
    sockets: {
        getSingle: function (userId, cb) { // Callable from socket.io as "user:getSingle"
            Model.user.findOne(userId).then(function (user) {
                cb(user);
            }).catch(function (err) {
                cb({ error: err });
            });
        }
    }
};
DevAlien answered Oct 12 '22