To explain the problem:
Assume there are 100 requests/second arriving.
So the load per server is automatically balanced as i add/remove servers, since each connection is so short lived.
Assume there are 100 clients, 2 servers (behind a LB)
Since websocket connections are persistent, adding/removing a server does not increase/decrease load per server until the clients decide to reconnect.
How does one then efficiently scale websockets and manage load per server?
Compared to REST, WebSockets allow for higher efficiency and are easier to scale because they do not require the HTTP request/response overhead for each message sent and received. Furthermore, the WebSocket protocol is push-based, enabling you to push data to connected clients as soon as events occur.
With at least 30 GiB RAM you can handle 1 million concurrent sockets.
But why are WebSockets hard to scale? The main challenge is that connections to your WebSocket server need to be persistent. And even once you've scaled out your server nodes both vertically and horizontally, you also need to provide a solution for sharing data between the nodes.
Horizontal Scaling It needs to be configured with some other tools in order to build a fully scalable architecture. Using a Publish/Subscribe or pub/sub broker is an effective method of horizontally scaling WebSockets. There are several off-the-shelf solutions like Kafka or Redis that can make this happen.
This is similar to problems the gaming industry has been trying to solve for a long time. That is an area where you have many concurrent connections and you have to have fast communication between many clients.
Options:
This prevents your clients from blowing up a single server. You'll have to have the client poll the master before establishing the WS connection but that is simple.
This way you can also scale out to multi master if you need to and put them behind load balancers.
If you need to send a message between servers there are many options for that (handle it yourself, queues, etc).
This is how my drawing app Pixmap for Android, which I built last year, works. Works very well too.
Client side load balancing where client connects to a random host name. This is how Watch.ly works. Each host can then be its own load balancer and cluster of servers for safety. Risky but simple.
Traditional load balancing - ie round robin. Hard to beat haproxy. This should be your first approach and will scale to many thousands of concurrent users. Doesn't solve the problem of redistributing load though. One way to solve that with this setup is to push an event to your clients telling them to reconnect (and have each attempt to reconnect with a random timeout so you don't kill your servers).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With