So I've loved the idea of Kafka since I first heard of it, but I haven't had the opportunity to get hands-on with it until recently. I think I have a use case that might apply, but I'd like to get some opinions from people who are more familiar with it.
Basically I'm thinking about a notification system that would batch messages over a given period of time (say 30 minutes) and send them out as emails, in-app notifications, or otherwise. I like Kafka for this problem primarily because of its inherent durability. I had considered using a more straightforward message queue like RabbitMQ, ActiveMQ, SQS, etc., but I don't like that those would force me to manage buffering on the consumer side and risk losing messages. Otherwise I would have to buffer in a secondary durable store, which seems to defeat the purpose of having the queue in the first place.
So my idea would be to group the notifications in partitions by user and then every 30 minutes the consumer would read the last 30 minutes of data, aggregate it, and send a summary notification composed of the individual notifications.
I have a few concerns:
Thanks for your feedback!
A Kafka Streams microservice that is responsible for processing the email event stream and sending out the emails.
Apache Kafka is a robust messaging system that enables the transfer of high volumes of messages from one endpoint to another. Batching in Kafka lets you process many messages together rather than one at a time.
Within a partition, Apache Kafka guarantees that the order of records is maintained; thus, when a producer sends the contents of a batch to a partition, the individual records in that batch keep their order.
For large messages, we can store the payload in a file at a shared storage location and send only the location through the Kafka message. This can be a faster option and has minimal processing overhead. Another option is to split the large message into small messages of, say, 1 KB each at the producer end.
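The splitting approach above can be sketched as follows. This is an illustrative sketch, not a Kafka API: the `Chunk` shape and the helper names are assumptions, and in practice the consumer would collect chunks by message ID before reassembling.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch: split a large payload into 1 KB chunks the producer could send as
// individual Kafka messages, plus reassembly on the consumer side.
public class Chunker {
    static final int CHUNK_SIZE = 1024; // 1 KB per chunk, as suggested above

    // messageId lets the consumer group chunks; index/total let it reassemble.
    record Chunk(String messageId, int index, int total, byte[] payload) {}

    static List<Chunk> split(String messageId, byte[] payload) {
        int total = (payload.length + CHUNK_SIZE - 1) / CHUNK_SIZE;
        List<Chunk> chunks = new ArrayList<>();
        for (int i = 0; i < total; i++) {
            int from = i * CHUNK_SIZE;
            int to = Math.min(from + CHUNK_SIZE, payload.length);
            chunks.add(new Chunk(messageId, i, total, Arrays.copyOfRange(payload, from, to)));
        }
        return chunks;
    }

    static byte[] reassemble(List<Chunk> chunks) {
        // Chunks may arrive out of order, so sort by index before concatenating.
        List<Chunk> sorted = new ArrayList<>(chunks);
        sorted.sort((a, b) -> Integer.compare(a.index(), b.index()));
        int size = sorted.stream().mapToInt(c -> c.payload().length).sum();
        byte[] out = new byte[size];
        int pos = 0;
        for (Chunk c : sorted) {
            System.arraycopy(c.payload(), 0, out, pos, c.payload().length);
            pos += c.payload().length;
        }
        return out;
    }
}
```

Note that if you go this route, all chunks of one message should share a record key so they land on the same partition and keep their relative order.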
It might be too late to answer this question now, and I think you might have a solution already. For other users who are thinking about the same thing, I'd like to say that your idea is pretty good, especially if you consider Kafka Streams. I am building a project called light-email with Kafka Streams and Kotlin. Currently it sends out one email per event; however, it would be very easy to aggregate multiple events together within a time window in Kafka Streams.
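To make the aggregation idea concrete, here is a conceptual sketch in plain Java of what the windowed aggregation computes (this is not the actual Kafka Streams API; in Streams you would express the same thing with `groupByKey()` and a 30-minute window). The `Event` shape is an assumption for illustration.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.AbstractMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Sketch: bucket each user's events into 30-minute windows and emit one
// summary per (user, window) pair - the shape of the batched notification.
public class WindowedSummary {
    static final Duration WINDOW = Duration.ofMinutes(30);

    record Event(String userId, String text, Instant at) {}

    // Align a timestamp to the start of its 30-minute window (epoch-aligned).
    static Instant windowStart(Instant at) {
        long w = WINDOW.toMillis();
        return Instant.ofEpochMilli(at.toEpochMilli() / w * w);
    }

    // One concatenated summary per (userId, windowStart) key.
    static Map<Map.Entry<String, Instant>, String> summarize(List<Event> events) {
        return events.stream().collect(Collectors.groupingBy(
                e -> new AbstractMap.SimpleEntry<>(e.userId(), windowStart(e.at())),
                Collectors.mapping(Event::text, Collectors.joining("; "))));
    }
}
```

In real Kafka Streams the window state is backed by a durable state store and changelog topic, so you keep the durability that attracted you to Kafka in the first place.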
To clarify two points from the comments.
We don't need to create a partition per user. We just need to ensure that events belonging to the same user go to the same partition. This simply means hashing the user ID to load-balance across partitions.
When sending a message fails, it should be moved to a dead-letter topic to be processed later. This prevents the current topic from being blocked.
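A minimal sketch of that pattern, assuming a `send` callback standing in for whatever delivery call you use, and a list standing in for the dead-letter topic:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Sketch: attempt each notification; on failure, divert the event to a
// dead-letter collection instead of blocking or retrying inline, so the
// main consumer keeps making progress.
public class DeadLetterExample {
    static List<String> processBatch(List<String> events, Consumer<String> send) {
        List<String> deadLetters = new ArrayList<>();
        for (String event : events) {
            try {
                send.accept(event);
            } catch (RuntimeException e) {
                deadLetters.add(event); // reprocess later from the dead-letter topic
            }
        }
        return deadLetters;
    }
}
```

In a real deployment the failed event would be produced to a dedicated topic (often with the error attached as a header) and a separate consumer would retry it on its own schedule.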