I have been trying to connect with kafka-avro-console-consumer from Confluent to our legacy Kafka cluster, which was deployed without Confluent Schema Registry. I provided the schema explicitly using properties like:
kafka-console-consumer --bootstrap-server kafka02.internal:9092 \
--topic test \
--from-beginning \
--property key.schema='{"type":"long"}' \
--property value.schema='{"type":"long"}'
but I am getting an 'Unknown magic byte!' error (org.apache.kafka.common.errors.SerializationException).
Is it possible to consume Avro messages from Kafka with Confluent's kafka-avro-console-consumer if they were not serialized with Confluent's AvroSerializer and the Schema Registry?
The Confluent Schema Registry serializer/deserializer uses a wire format that puts schema information in the initial bytes of each message: a magic byte (0x0), followed by a four-byte schema ID, and only then the Avro-encoded payload.
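For illustration, here is a minimal sketch (not the actual Confluent source) of how that header could be inspected, assuming only the layout described above:

import java.nio.ByteBuffer;

public class WireFormatCheck {

    // Returns true if the payload starts with the Confluent wire-format header.
    // The Confluent deserializer throws "Unknown magic byte!" when this is false.
    public static boolean looksLikeConfluentWireFormat(byte[] payload) {
        return payload != null && payload.length >= 5 && payload[0] == 0x0;
    }

    // Reads the four-byte big-endian schema ID that follows the magic byte.
    public static int schemaId(byte[] payload) {
        return ByteBuffer.wrap(payload, 1, 4).getInt();
    }
}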
If your message has not been serialized using the Schema Registry serializer, then you won't be able to deserialize it that way, and you will get the 'Unknown magic byte!' error.
So you'll need to write a consumer that pulls the messages, deserializes them using your Avro .avsc schemas, and then, assuming you want to preserve the data, re-serializes them using the Schema Registry serializer.
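As a rough sketch of that first step — reading the raw bytes and decoding them with a locally supplied schema — something like the following would work. The broker address and topic are taken from the question; the value.avsc file and the group id are placeholders, so adjust them to your setup:

import java.io.File;
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.io.DatumReader;
import org.apache.avro.io.DecoderFactory;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class PlainAvroConsumer {
    public static void main(String[] args) throws Exception {
        // Load the writer's schema from a local .avsc file (placeholder name)
        Schema schema = new Schema.Parser().parse(new File("value.avsc"));
        DatumReader<Object> reader = new GenericDatumReader<>(schema);

        Properties props = new Properties();
        props.put("bootstrap.servers", "kafka02.internal:9092");
        props.put("group.id", "plain-avro-reader");
        props.put("auto.offset.reset", "earliest");
        // Read raw bytes; do NOT use the Confluent Avro deserializer here
        props.put("key.deserializer",
                  "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        props.put("value.deserializer",
                  "org.apache.kafka.common.serialization.ByteArrayDeserializer");

        try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("test"));
            while (true) {
                ConsumerRecords<byte[], byte[]> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<byte[], byte[]> record : records) {
                    // Decode the value as bare Avro binary with the known schema;
                    // no magic byte or schema ID is expected. The key could be
                    // decoded the same way with its own schema.
                    Object value = reader.read(null,
                            DecoderFactory.get().binaryDecoder(record.value(), null));
                    System.out.println(value);
                }
            }
        }
    }
}

Once you have the decoded records, re-publishing them through a producer configured with io.confluent.kafka.serializers.KafkaAvroSerializer (and schema.registry.url) will write them back out in the Schema Registry wire format.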
Edit: I wrote an article recently that explains this whole thing in more depth: https://www.confluent.io/blog/kafka-connect-deep-dive-converters-serialization-explained
kafka-console-consumer has no knowledge of key.schema or value.schema; only the Avro console producer does. Source code here
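For example, key.schema and value.schema are producer-side properties; a registry-backed setup would use them like this (the registry URL is a placeholder):

kafka-avro-console-producer --broker-list kafka02.internal:9092 \
--topic test \
--property schema.registry.url=http://localhost:8081 \
--property value.schema='{"type":"long"}'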
The regular console consumer doesn't care about the format of the data; it just prints UTF-8-encoded bytes.
The only schema-related property that kafka-avro-console-consumer accepts is schema.registry.url. So, to answer the question: yes, the messages need to be serialized using the Confluent serializers.
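In other words, once the data is in the registry wire format, the consumer invocation only needs the registry URL (shown here with a placeholder address):

kafka-avro-console-consumer --bootstrap-server kafka02.internal:9092 \
--topic test \
--from-beginning \
--property schema.registry.url=http://localhost:8081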