
Is key required as part of sending messages to Kafka?

// KeyedMessage(topic, message): no key is supplied here, so the message key is null
KeyedMessage<String, byte[]> keyedMessage = new KeyedMessage<String, byte[]>(request.getRequestTopicName(), SerializationUtils.serialize(message));
producer.send(keyedMessage);

Currently I am sending messages without any key (the KeyedMessage carries only a topic and a payload). Will this still work with delete.retention.ms? Do I need to send a key with each message, and is it good practice to do so?

gaurav asked Oct 05 '22 18:10




1 Answer

Keys are mostly useful or necessary if you require strong ordering per key and are developing something like a state machine. If messages with the same key (for instance, a unique id) must always be seen in the correct order, attaching a key ensures they always go to the same partition of a topic, as long as the partition count does not change. Kafka guarantees order within a partition but not across partitions in a topic, so not providing a key, which spreads messages across partitions (round-robin, or a random or sticky partition choice, depending on the producer version), will not maintain such order.
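To make the key-to-partition mapping concrete, here is a minimal sketch in plain Java with no Kafka dependency. The class name, the partition count, and the use of Arrays.hashCode are assumptions for illustration only; Kafka's own default partitioner uses a different hash (murmur2), so the partition numbers printed here will not match a real cluster. The point is only that a deterministic hash of the key, taken modulo the partition count, sends equal keys to the same partition.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class KeyPartitionSketch {

    // Simplified stand-in for a keyed partitioner: hash the key bytes and
    // map the hash onto one of numPartitions partitions.
    // NOTE: Arrays.hashCode is used purely for illustration; it is NOT the
    // hash that Kafka's default partitioner uses.
    static int partitionFor(String key, int numPartitions) {
        byte[] keyBytes = key.getBytes(StandardCharsets.UTF_8);
        // Clear the sign bit so the value is non-negative before the modulus.
        return (Arrays.hashCode(keyBytes) & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        int numPartitions = 6; // hypothetical partition count
        // "order-1" appears twice and is mapped to the same partition both times.
        for (String key : new String[] {"order-1", "order-2", "order-1"}) {
            System.out.println(key + " -> partition " + partitionFor(key, numPartitions));
        }
    }
}
```

Because the result depends on numPartitions, the same key can land on a different partition after partitions are added to a topic, which is why per-key ordering assumes a stable partition count.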

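In the case of a state machine, keys can be used with log compaction (cleanup.policy=compact on the topic, with log.cleaner.enable set to true on the broker) to deduplicate entries with the same key. In that case, Kafka assumes that your application only cares about the most recent value for a given key, and the log cleaner deletes older duplicates of that key. This only works for non-null keys, so compaction requires keys. The delete.retention.ms topic setting you mention (log.cleaner.delete.retention.ms at the broker level) applies only to compacted topics: it controls how long delete markers, i.e. messages with a key and a null value, are retained before the cleaner removes them.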
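The effect of compaction described above can be sketched in plain Java as "keep the latest value per key". This is an in-memory illustration, not the broker's log cleaner: the class name and the String[] {key, value} record shape are hypothetical, a null value is treated as a delete marker that removes the key immediately (a real broker keeps the marker for delete.retention.ms first), and records with a null key are simply skipped, which is an assumption of this sketch rather than a statement of broker behavior.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class CompactionSketch {

    // Each record is a two-element array: {key, value}.
    // Returns the surviving key -> latest value view of the log.
    static Map<String, String> compact(List<String[]> log) {
        Map<String, String> latest = new LinkedHashMap<>();
        for (String[] record : log) {
            String key = record[0];
            String value = record[1];
            if (key == null) {
                // Without a key there is nothing to deduplicate on.
                // Skipping is an assumption made for this sketch only.
                continue;
            }
            if (value == null) {
                // Null value = delete marker: drop the key from the result.
                latest.remove(key);
            } else {
                // A newer record replaces any older value for the same key.
                latest.put(key, value);
            }
        }
        return latest;
    }

    public static void main(String[] args) {
        List<String[]> log = Arrays.asList(
            new String[] {"user-1", "created"},
            new String[] {"user-2", "created"},
            new String[] {"user-1", "verified"},  // supersedes the first user-1 record
            new String[] {null, "no key"},        // cannot be compacted by key
            new String[] {"user-2", null});       // delete marker for user-2
        System.out.println(compact(log));
    }
}
```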

Alternatively, the more common time-based retention, set with log.retention.hours and enabled by default (168 hours, i.e. 7 days), works by deleting complete segments of the log that are out of date. In this case keys do not have to be provided; Kafka will simply delete chunks of the log that are older than the given retention period.
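For contrast with compaction, time-based retention can be sketched as a filter over whole segments that never looks at message keys. The class and method names below are hypothetical, and each segment is reduced to a single timestamp; a real broker tracks more state per segment, so treat this only as an illustration of the idea.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.TimeUnit;

class RetentionSketch {

    // Keeps only the segments whose timestamp is within retentionMs of nowMs.
    // Keys play no role: whole segments are kept or dropped based on age alone.
    static List<Long> retain(List<Long> segmentTimestampsMs, long nowMs, long retentionMs) {
        List<Long> kept = new ArrayList<>();
        for (long timestampMs : segmentTimestampsMs) {
            if (nowMs - timestampMs <= retentionMs) {
                kept.add(timestampMs);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        long day = TimeUnit.DAYS.toMillis(1);
        long retentionMs = TimeUnit.HOURS.toMillis(168); // 7 days
        long nowMs = 10 * day;                           // hypothetical "current time"
        // Segments last written on day 1, day 5, and day 9.
        List<Long> segments = Arrays.asList(1 * day, 5 * day, 9 * day);
        // The day-1 segment is 9 days old and falls outside the 7-day window.
        System.out.println(retain(segments, nowMs, retentionMs));
    }
}
```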

That's all to say, if you've enabled log compaction or require strict order for messages with the same key then you should definitely be using keys. Otherwise, null keys may provide better distribution and prevent potential hot spotting issues in cases where some keys may appear more than others.

kuujo answered Oct 08 '22 06:10
