Spring Kafka Partitioning

What is the difference in behavior between the following two code snippets for publishing a message?

Approach 1

Message<String> message = MessageBuilder.withPayload("testmsg")
        .setHeader(KafkaHeaders.MESSAGE_KEY, "key").setHeader(KafkaHeaders.TOPIC, "test").build();

ListenableFuture<SendResult<String, String>> future = kafkaTemplate.send(message);

Approach 2

ListenableFuture<SendResult<String, String>> future = kafkaTemplate.send("test", "testmsg");

Topic Config:

$ bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic test
Topic:test   PartitionCount:3    ReplicationFactor:1 Configs:
Topic: test  Partition: 0    Leader: 0   Replicas: 0 Isr: 0
Topic: test  Partition: 1    Leader: 0   Replicas: 0 Isr: 0
Topic: test  Partition: 2    Leader: 0   Replicas: 0 Isr: 0

Observation:

With 3 consumers, one per partition, Approach 1 results in all messages being consumed by a single consumer from a single partition. With Approach 2, consumption is split evenly across the 3 partitions/consumers.

asked Aug 07 '17 21:08

srini

1 Answer

The answer is already in your code: the first approach supplies a message key along with the topic, while the second supplies only the topic and the payload.

The message key is used to determine the target partition when the partition isn't specified explicitly:

/**
 * computes partition for given record.
 * if the record has partition returns the value otherwise
 * calls configured partitioner class to compute the partition.
 */
private int partition(ProducerRecord<K, V> record, byte[] serializedKey, byte[] serializedValue, Cluster cluster) {
    Integer partition = record.partition();
    return partition != null ?
            partition :
            partitioner.partition(
                    record.topic(), record.key(), serializedKey, record.value(), serializedValue, cluster);
}

where DefaultPartitioner does this:

List<PartitionInfo> partitions = cluster.partitionsForTopic(topic);
int numPartitions = partitions.size();
if (keyBytes == null) {
    int nextValue = nextValue(topic);
    ...
} else {
    // hash the keyBytes to choose a partition
    return Utils.toPositive(Utils.murmur2(keyBytes)) % numPartitions;
}

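So all messages with the same key are sent to the same partition. Since Approach 1 always uses the key "key", every message lands in one partition. Messages without a key, as in Approach 2, are spread across the topic's partitions in a round-robin manner.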
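
The decision flow in the two snippets above can be sketched in plain Java, without any Kafka dependency. This is a simplified illustration, not Kafka's actual code: the class name PartitionSketch and the method choosePartition are hypothetical, Arrays.hashCode stands in for Utils.murmur2 (so the partition numbers will differ from what a real producer picks), and the round-robin counter is reduced to a single shared counter.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch only: mirrors the decision flow, not Kafka's real classes.
class PartitionSketch {
    // Counter used to rotate keyless records across partitions (simplified).
    private static final AtomicInteger counter = new AtomicInteger(0);

    // Clears the sign bit so the modulo result is never negative.
    static int toPositive(int n) {
        return n & 0x7fffffff;
    }

    // explicitPartition and keyBytes may each be null.
    static int choosePartition(Integer explicitPartition, byte[] keyBytes, int numPartitions) {
        if (explicitPartition != null) {
            return explicitPartition; // an explicitly set partition always wins
        }
        if (keyBytes == null) {
            // No key: rotate through the partitions.
            return toPositive(counter.getAndIncrement()) % numPartitions;
        }
        // Key present: hash it. Arrays.hashCode is an assumed stand-in for murmur2.
        return toPositive(Arrays.hashCode(keyBytes)) % numPartitions;
    }

    public static void main(String[] args) {
        byte[] key = "key".getBytes(StandardCharsets.UTF_8);
        // Like Approach 1: a fixed key always maps to the same partition.
        for (int i = 0; i < 3; i++) {
            System.out.println("keyed   -> " + choosePartition(null, key, 3));
        }
        // Like Approach 2: no key, so the partition rotates.
        for (int i = 0; i < 3; i++) {
            System.out.println("keyless -> " + choosePartition(null, null, 3));
        }
    }
}
```

If the goal with Approach 1 is to spread load while still keeping per-key ordering, varying the key per message (for example, an entity ID rather than a constant) lets the hash distribute messages across partitions.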

answered Oct 14 '22 04:10

Artem Bilan

Tags:

apache-kafka

kafka-consumer-api

kafka-producer-api

spring-kafka
