Is it possible to scale out Node-RED horizontally on a cluster of nodes? Could not find any documentation on the same. My scenario is to handle millions of events per second and process them in real time using Node-RED.
I posted the question on the Google Groups Node-RED forum (https://groups.google.com/forum/#!topic/node-red/Nx1WWqBeLbI) and got interesting answers. Jotting down the various options below.
If your input is over HTTP, then you can use any of the standard load-balancing techniques to load balance requests over a cluster of nodes running the same Node-RED flow - e.g. one can use HAProxy, Nginx, etc. It is important to note that since we are running the same flow over many nodes, we cannot store any state in context variables. We have to store state in an external service such as Redis.
If you are ingesting over MQTT, then we have multiple options: Option A: Let each flow listen to a different topic. You can have different gateways publish to different topics on the MQTT broker - e.g. Flow instance 1 subscribes to device/a/# Node-RED instance 2 subscribe to device/b/# and so on.
Option B: Some MQTT brokers support the concept of 'Shared Subscription' (HiveMQ) that is equivalent to point-to-point messaging - i.e. each consumer in a subsciption group gets a message and then the broker load-balances using round-robin. A good explanation on how to enable this using HiveMQ is given here - http://www.hivemq.com/blog/mqtt-client-load-balancing-with-shared-subscriptions/. The good thing about the HiveMQ support for load-balancing consumers is that there is no change required in the consumer code. You can continue using any MQTT consumer - only the topic URL would change :)
Option C: You put a simple Node-RED flow for message ingestion that reads the payload and makes an HTTP request to a cluster of load-balanced Node-RED flows (similar to Option 1)
Option D: This is an extension to Option C and entails creating a buffer between message ingestion and message processing using Apache Kafka. We ingest the message from devices over MQTT and extract the payload and post it on a Kafka topic. Kafka can support a message-queue paradigm using the concept of consumer groups. Thus we can have multiple node-red flow instances subscribing to the Kafka topic using the same consumer group. This option also makes sense, if your message broker does not support load-balancing consumers.
Have posted a blog post with links here - http://www.narendranaidu.com/2016/07/scaling-node-red-horizontally-for-high.html
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With