I'm having a recurrent issue with Kafka: I partition messages by customer id, and sometimes it happens that a customer gets a huge amount of messages. As a result, the messages of this customer and all other customers in the same partition get delayed.
Are there well-known ways to handle this issue? Possibly with other messaging platforms?
Ideally, only the messages of that one customer would be delayed; every other customer's messages would get an equal share of the consumers' bandwidth.
Note: I must partition by customer id, because I want to consume the messages of any given customer in order. However, I can consume the messages of two different customers in any order.
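For illustration, a minimal sketch of producing messages keyed by customer id with the kafka-python client (the topic name, broker address, and customer ids are made up):

```python
from kafka import KafkaProducer

producer = KafkaProducer(bootstrap_servers="localhost:9092")

def send_event(customer_id: str, payload: bytes) -> None:
    # Using the customer id as the message key sends all of that customer's
    # messages to the same partition, which preserves per-customer ordering
    # -- and is also what causes the hot-partition problem described above.
    producer.send("customer-events", key=customer_id.encode("utf-8"), value=payload)

send_event("customer-42", b'{"action": "order_created"}')
producer.flush()
```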
Often the data will stay there and get deleted once a specified retention period or a maximum size/data limit has been reached. As for your other question about the reasoning for having more partitions, it simply comes down to scaling.
Kafka can distribute messages by a key associated with each message: if the key is the same for several messages, all of them are put in the same partition. Message brokers may also offer different delivery guarantees, such as "at most once", "at least once", or "exactly once".
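A simplified sketch of that key-to-partition mapping (Kafka's default partitioner actually uses a murmur2 hash of the key bytes, but the "hash modulo partition count" idea is the same; the partition count here is an assumption):

```python
import zlib

NUM_PARTITIONS = 12  # assumed partition count for the topic

def partition_for(key: bytes) -> int:
    # Same key -> same hash -> same partition, every time.
    return zlib.crc32(key) % NUM_PARTITIONS

print(partition_for(b"customer-42"))  # always the same partition
print(partition_for(b"customer-42"))
print(partition_for(b"customer-77"))  # may land on a different partition
```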
But here are a few general rules: a maximum of 4,000 partitions per broker (in total, distributed over many topics); a maximum of 200,000 partitions per Kafka cluster (in total, distributed over many topics); resulting in a maximum of 50 brokers per Kafka cluster.
If there is only one broker, both partitions are stored on that same broker. Whenever the broker count is lower than the partition count, several partitions of the same topic end up on the same broker. Apache Kafka is a distributed system.
A Kafka message is a small or medium-sized piece of data. To Kafka, a message is nothing but a simple array of bytes; it is just information or data coming from different sources, and those sources are not specific to any platform or software.
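To illustrate the "just bytes" point, a small sketch (the payload fields are made up) of turning an arbitrary payload into the byte arrays Kafka actually stores:

```python
import json

# Any payload can be sent, as long as it is serialized to bytes first.
event = {"customer_id": "customer-42", "action": "order_created", "amount": 42.5}

value_bytes = json.dumps(event).encode("utf-8")   # what Kafka stores as the value
key_bytes = event["customer_id"].encode("utf-8")  # what the partitioner hashes
```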
Adding partitions in Kafka introduces latency spikes while rebalancing occurs, so we tend to size the partitions according to the peak loads and the scaling-out needs of the consumers. But if we do need to increase the number of partitions and consumers for scaling purposes, then we only pay a momentary latency cost while the rebalance occurs.
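If you do have to grow a topic later, a sketch of doing it programmatically with kafka-python's admin client (the topic name and target count are assumptions):

```python
from kafka.admin import KafkaAdminClient, NewPartitions

admin = KafkaAdminClient(bootstrap_servers="localhost:9092")

# Grow the hypothetical "customer-events" topic to 24 partitions in total.
# Note that new messages for an existing key may now hash to a different
# partition, so per-key ordering is only guaranteed going forward --
# another reason to get the partition count right up front.
admin.create_partitions({"customer-events": NewPartitions(total_count=24)})
admin.close()
```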
LZ4 is the weakest candidate in terms of compression ratio. Based on our own test results, enabling compression when sending messages with Kafka can provide great benefits in terms of disk space utilization and network usage, at the cost of only slightly higher CPU utilization and increased dispatch latency.
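Enabling compression is a producer-side setting; with kafka-python, for example, it is a single constructor argument (broker address assumed):

```python
from kafka import KafkaProducer

# Batches are compressed with LZ4 before being sent; the broker stores them
# compressed, trading a little CPU for less network and disk usage.
producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    compression_type="lz4",
)
```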
By using a hashing function to route messages to partitions, Kafka gives us data locality. For example, messages related to user id 1001 always go to consumer 3. Because user 1001's events always go to consumer 3, that consumer can efficiently perform operations that would not be feasible if network round-trips were needed.
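A sketch of what that locality buys on the consumer side (kafka-python; the topic and group names are assumptions): per-customer state can live in plain process memory, because the same customer's events always arrive at the same consumer instance.

```python
from collections import defaultdict
from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "customer-events",
    bootstrap_servers="localhost:9092",
    group_id="billing-service",
)

# Purely local, in-memory state. It stays correct without any distributed
# lookups because a given key always maps to the same partition, and each
# partition is consumed by exactly one member of the group.
events_per_customer = defaultdict(int)

for record in consumer:
    customer_id = record.key.decode("utf-8")
    events_per_customer[customer_id] += 1
```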
I will try to answer based on the limited information provided.
Kafka partitions are the smallest unit of scalability, so, for example, if you have 10 parallel consumers (Kafka topic listeners) you should partition your topic into this number or higher; otherwise, some of your listeners will be starved, because Kafka manages consumers in such a way that only one consumer in a group receives messages from a given partition. This protects the partition from getting its message order mixed up. The other direction is supported: a consumer can handle more than one partition at a time.
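For illustration, a sketch of that one-partition-per-consumer rule (kafka-python; topic, group, and counts are assumptions). Run several copies of this script and Kafka spreads the partitions across the running instances, but never gives the same partition to two members of the group:

```python
from kafka import KafkaConsumer

# If the topic has 10 partitions and you start 10 copies of this process,
# each gets one partition; start 12 copies and 2 of them sit idle;
# start 5 copies and each handles 2 partitions.
consumer = KafkaConsumer(
    "customer-events",
    bootstrap_servers="localhost:9092",
    group_id="customer-events-workers",
)

for record in consumer:
    print(f"partition={record.partition} offset={record.offset} key={record.key}")
```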
My design suggestion would be to decide how much capacity you are planning to allocate for the consumer (microservice) instances. This number will guide you to the right number of partitions.
I would avoid using a dynamic number of partitions, as this does not scale well. Use a number that matches the capacity you plan to allocate, plus some spare in case you need to scale up in the future. If tomorrow you get 5 new customers, adding partitions is not easy or wise.
Kafka will make sure the messages stay in order per partition, so this comes for free for your use case. What you need is for the consumer end to be able to handle the messages of the different customer IDs in the right order. To avoid messages of the same customer getting out of order, your partition key can be a higher-level category of customers; I can think of customer type/region/size... The idea is that all of a single customer's messages stay in the same partition.
Your partition key must also relate to the size of the messages/data, so that your messages spread evenly over your Kafka cluster. This helps with the Kafka cluster's own scaling and redundancy.
Deciding on the right partitioning strategy is hard, but it is worth the time spent planning it.
One design solution that comes up a lot is hashing: map the customer ID to a partition key using a hash. Again, decide on a fixed number of partitions and let the hash map the customer ID to your partition key.
Say X customers have a lot of messages and you need one topic per such customer; in this case you map one customer per topic, so your modulo will be the number of these customers.
Y customers are low-traffic customers; for these, use a different modulo, for example Y/5, so that 5 customers share a topic.
Make sure you add the X partition count as an offset to the Y partition numbers so the two groups don't overlap.
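A sketch of that two-tier scheme (all names, counts, and the list of heavy customers are assumptions; for brevity the tiers are modeled as partition ranges within a single topic rather than separate topics, with the partition chosen explicitly instead of via key hashing):

```python
import zlib
from kafka import KafkaProducer

# Hypothetical setup: 3 known heavy customers get a dedicated partition each
# (partitions 0..2); everyone else shares the next 5 partitions (3..7).
HEAVY_CUSTOMERS = {"cust-big-1": 0, "cust-big-2": 1, "cust-big-3": 2}
HEAVY_COUNT = len(HEAVY_CUSTOMERS)   # the "X" offset
SHARED_PARTITIONS = 5                # the "Y / 5" group

def partition_for(customer_id: str) -> int:
    if customer_id in HEAVY_CUSTOMERS:
        return HEAVY_CUSTOMERS[customer_id]
    # Low-traffic customers hash into the shared range, offset by the number
    # of dedicated partitions so the two groups never overlap.
    return HEAVY_COUNT + (zlib.crc32(customer_id.encode()) % SHARED_PARTITIONS)

producer = KafkaProducer(bootstrap_servers="localhost:9092")

def send_event(customer_id: str, payload: bytes) -> None:
    # Bypass the default partitioner and choose the partition explicitly;
    # a single customer still always lands on one partition, so per-customer
    # ordering is preserved.
    producer.send(
        "customer-events",
        key=customer_id.encode(),
        value=payload,
        partition=partition_for(customer_id),
    )
```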
The only issue I see is that this is not flexible: you cannot change the mapping if the number of customers changes. You might allow more topics in each group to support future partitions.