I am new to NATS JetStream and I have been reading the official documentation (https://docs.nats.io/jetstream/jetstream) to understand its concepts and compare it with Kafka. One of my major use cases is message/event ordering based on a particular id (like a partition key in the Kafka world).
For example, there are several update events coming in for an Order entity, and my system needs to consume the events for a particular Order in the same order they were produced. In Kafka, I would use the order-id as the partition key while publishing to the topic. How do I accomplish this in JetStream?
I have come across a de-duplication key (Nats-Msg-Id) in JetStream, but I think this feature is more synonymous with topic compaction in Kafka. Am I right?
Nevertheless, I have written the following Go code for publishing:
order := Order{
    OrderId: orderId,
    Status:  status,
}
orderJson, err := json.Marshal(order)
if err != nil {
    log.Fatal(err)
}
// nats.MsgId sets the Nats-Msg-Id header used for de-duplication
_, err = js.Publish(subjectName, orderJson, nats.MsgId(order.OrderId))
Am I doing this right? Will all events for a particular orderId go to the same consumer within a consumer group in the JetStream world, hence maintaining the sequence?
This is what I get from @tbeets' suggestion. For example, I have predefined 10 stream subjects: ORDER.1, ORDER.2, ORDER.3 ... ORDER.10. On the publishing side, I can compute order-id%10+1 to find the exact stream subject to publish to. So here, we have accomplished that all update events for the same orderId will go to the same stream subject every time.
Now, on the subscriber side, I have 10 consumer groups (with 10 consumers within each group), each consuming from a particular stream subject: consumerGroup-1 consumes from ORDER.1, consumerGroup-2 consumes from ORDER.2, and so on.
Say 2 order update events come in for order-id 111, which maps to the ORDER.2 stream subject (111%10+1 = 2), so consumerGroup-2 will consume these 2 events. But within this consumer group, the 2 update events can go to different consumers, and if one of those consumers is busy or slow, then at an overall level the order update events may be consumed out of order.
Kafka solves this with the partition key: each partition is assigned to exactly one consumer within a consumer group, so all events for the same orderId are consumed by the same consumer, maintaining the sequence of order update events. How do I solve this issue in JetStream?
In NATS, your publish subject can contain multiple delimited tokens. So, for instance, your Order event could be published to ORDER.{store}.{orderid}, where the last two tokens are specific to each event and provide whatever slice-and-dice dimensions you need for your use case.
You then define a JetStream stream on ORDER.> (i.e. all of the order events). Any number of consumers (ephemeral or durable) can then be created on that stream, each with an optional filter subject to suit your use case (e.g. ORDER.Store24.>) on the underlying stream's messages. JetStream guarantees that messages (filtered or unfiltered) are delivered in the order they were received.
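To make the filtering mechanics concrete, here is a small, self-contained sketch of how NATS-style subject filters select subjects: tokens are dot-delimited, "*" matches exactly one token, and a trailing ">" matches one or more remaining tokens. The matches helper is my own illustration, not part of the nats.go API; with a real stream you would instead set FilterSubject on the consumer configuration.

```go
package main

import (
	"fmt"
	"strings"
)

// matches reports whether a NATS-style filter selects a subject:
// "*" matches exactly one token, a trailing ">" matches one or
// more remaining tokens, and any other token must match literally.
func matches(filter, subject string) bool {
	f := strings.Split(filter, ".")
	s := strings.Split(subject, ".")
	for i, tok := range f {
		if tok == ">" {
			return len(s) > i // ">" needs at least one more token
		}
		if i >= len(s) || (tok != "*" && tok != s[i]) {
			return false
		}
	}
	return len(f) == len(s)
}

func main() {
	// A stream on ORDER.> captures every order event...
	fmt.Println(matches("ORDER.>", "ORDER.Store24.123")) // true
	// ...while a consumer filtered to ORDER.Store24.> sees only one store.
	fmt.Println(matches("ORDER.Store24.>", "ORDER.Store24.123")) // true
	fmt.Println(matches("ORDER.Store24.>", "ORDER.Store99.123")) // false
}
```

Since the stream stores every ORDER.> message in arrival order, any filtered view of it (one store, one order id) is delivered in that same order.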