I want to push a text consisting of multiple lines as one message into a kafka topic.
After I enter:
kafka-console-producer --broker-list localhost:9092 --topic myTopic
and copy my text:
My Text consists of:
two lines instead of one
I get two messages in the kafka topic, but I want to have just one. Any ideas how to achieve that? Thanks
You can use kafkacat for this, with its -D option to specify a custom message delimiter (in this example /):
kafkacat -b kafka:29092 \
-t test_topic_01 \
-D/ \
-P <<EOF
this is a string message
with a line break/this is
another message with two
line breaks!
EOF
Note that the delimiter must be a single byte; multi-byte characters will end up being included in the resulting message. See issue #140.
Resulting messages, inspected also using kafkacat:
$ kafkacat -b kafka:29092 -C \
-f '\nKey (%K bytes): %k\t\nValue (%S bytes): %s\nPartition: %p\tOffset: %o\n--\n' \
-t test_topic_01
Key (-1 bytes):
Value (43 bytes): this is a string message
with a line break
Partition: 0 Offset: 0
--
Key (-1 bytes):
Value (48 bytes): this is
another message with two
line breaks!
Partition: 0 Offset: 1
--
% Reached end of topic test_topic_01 [0] at offset 2
Inspecting using kafka-console-consumer:
$ kafka-console-consumer \
--bootstrap-server kafka:29092 \
--topic test_topic_01 \
--from-beginning
this is a string message
with a line break
this is
another message with two
line breaks!
(thus illustrating why kafkacat is nicer to work with than kafka-console-consumer because of its optional verbosity :) )
It's not possible with kafka-console-producer, as it reads its input one line at a time and so treats every newline as the end of a message.
You would need to do it via your own producer code.
With the console tools you are presumably just testing with the data you expect to receive from a client. If it is meant to be a single message, it is better to keep it as a single string and mark the line breaks with a unique delimiter, e.g.
{this is line one ^^ this is line two}
Then handle the message accordingly in your consumer job. Even if the client plans to send multiple sentences in one message, it is better to send them as a single string: that keeps serialization simple and is more efficient afterwards.
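A rough sketch of that delimiter idea in shell, in case it helps. The function names join_lines and split_lines and the file message.txt are made up for illustration, and it assumes the marker " ^^ " never occurs in the text itself. The first function collapses multi-line input into one line before it reaches the console producer, the second restores the line breaks on the consumer side.

```shell
#!/bin/sh
# Marker that stands in for a line break (assumed never to occur in the text).
DELIM=' ^^ '

# Join all lines read from stdin into a single line, separated by the marker.
join_lines() {
    awk -v d="$DELIM" '
        NR > 1 { printf "%s", d }   # marker between lines, not before the first
        { printf "%s", $0 }
        END { print "" }            # terminate the single output line
    '
}

# Turn each marker in a joined message back into a line break.
# index()/substr() are used so the marker is matched literally, not as a regex.
split_lines() {
    awk -v d="$DELIM" '
        {
            line = $0
            while ((p = index(line, d)) > 0) {
                print substr(line, 1, p - 1)
                line = substr(line, p + length(d))
            }
            print line
        }
    '
}

# Possible usage (not executed here; message.txt is a hypothetical file):
#   join_lines < message.txt | kafka-console-producer --broker-list localhost:9092 --topic myTopic
```

Each message then arrives as one line containing the marker, and whatever reads the topic can pipe the value through split_lines to get the original lines back.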