I am trying to find an example where I can produce and consume Avro messages with Kafka.
At this point in time, I want to use a "vanilla" Kafka deployment without any Confluent add-ons.
Is this possible? All the examples I have found so far very quickly start using Confluent-specific tools for Avro messages.
I am sure there should be a way to publish and consume Avro messages on just the Kafka platform, without any distribution-specific add-ons.
The Spring framework has great support for testing your Spring application with Apache Kafka: you can test whether your application has received data from the topic. But if your application uses Apache Avro for serialization, and you do not use the Confluent Schema Registry (which is the case when you want to use spring-kafka-test for testing), then you have a problem.
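One way around that problem is to keep the Kafka-level serializers at plain byte[] and do the Avro encoding/decoding in your own code, which works fine against spring-kafka-test's embedded broker because no schema registry is involved. The following is only an illustrative sketch under stated assumptions: it assumes a Spring Boot test, spring-kafka-test on the test classpath, a made-up topic name "users", and a hypothetical deserialize() helper like the one in the plain-Avro example further down -- none of these names come from the question.

import org.apache.kafka.clients.consumer.Consumer;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.serialization.ByteArrayDeserializer;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.kafka.core.DefaultKafkaConsumerFactory;
import org.springframework.kafka.test.EmbeddedKafkaBroker;
import org.springframework.kafka.test.context.EmbeddedKafka;
import org.springframework.kafka.test.utils.KafkaTestUtils;

import java.util.Map;

@SpringBootTest
@EmbeddedKafka(partitions = 1, topics = "users") // "users" is a made-up topic name
class AvroOverEmbeddedKafkaTest {

  @Autowired
  private EmbeddedKafkaBroker broker;

  @Test
  void receivesAvroEncodedRecord() {
    // Kafka itself only sees byte[]; no schema registry is involved anywhere.
    Map<String, Object> props = KafkaTestUtils.consumerProps("test-group", "true", broker);
    try (Consumer<String, byte[]> consumer = new DefaultKafkaConsumerFactory<>(
        props, new StringDeserializer(), new ByteArrayDeserializer()).createConsumer()) {
      broker.consumeFromAnEmbeddedTopic(consumer, "users");

      // ... trigger the application under test so it publishes an Avro-encoded message ...

      ConsumerRecord<String, byte[]> record = KafkaTestUtils.getSingleRecord(consumer, "users");
      // Decode record.value() with the Avro API yourself, e.g. via a deserialize()
      // helper like the one sketched in the plain-Avro example further down, and
      // then run your assertions on the decoded object.
    }
  }
}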
Of course you can do that without any Confluent tooling. But you have to do additional work on your side (e.g. in your application code) -- which was the original motivation of providing Avro-related tooling such as the ones from Confluent that you mentioned.
One option is to manually serialize/deserialize the payload of your Kafka messages (e.g. from YourJavaPojo to byte[]) by using the Apache Avro Java API directly. (I suppose that you implied Java as the programming language of choice.) How would this look like? On the producer side, you use the Avro API to encode your data (e.g. from YourJavaPojo to byte[]) and then use Kafka's Java producer client to write the encoded payload to a Kafka topic; on the consumer side, you read the raw byte[] payload from the topic and use the Avro API to decode it back (e.g. from byte[] to a Java pojo). You can also use the Avro API directly, of course, when working with stream processing tools like Kafka Streams (will be included in the upcoming Apache Kafka 0.10) or Apache Storm.
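To make that flow concrete, here is a minimal sketch in Java: it encodes an Avro GenericRecord to byte[] with the Avro API and sends it with Kafka's plain Java producer configured with a ByteArraySerializer. The schema, the topic name "users", the class name, and the localhost:9092 broker address are all made-up placeholders for illustration, not something taken from the question.

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Properties;

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.BinaryDecoder;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.DatumReader;
import org.apache.avro.io.DatumWriter;
import org.apache.avro.io.DecoderFactory;
import org.apache.avro.io.EncoderFactory;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.ByteArraySerializer;
import org.apache.kafka.common.serialization.StringSerializer;

public class PlainAvroKafkaSketch {

  // A made-up schema; in a real application you would load your own .avsc file
  // or use classes generated by the Avro compiler (SpecificRecord).
  private static final Schema USER_SCHEMA = new Schema.Parser().parse(
      "{\"type\":\"record\",\"name\":\"User\",\"fields\":["
          + "{\"name\":\"name\",\"type\":\"string\"},"
          + "{\"name\":\"age\",\"type\":\"int\"}]}");

  // Avro encode: GenericRecord -> byte[]
  static byte[] serialize(GenericRecord record) throws IOException {
    DatumWriter<GenericRecord> writer = new GenericDatumWriter<>(record.getSchema());
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(out, null);
    writer.write(record, encoder);
    encoder.flush();
    return out.toByteArray();
  }

  // Avro decode: byte[] -> GenericRecord; without a schema registry, producer and
  // consumer must agree on the schema out of band.
  static GenericRecord deserialize(byte[] bytes) throws IOException {
    DatumReader<GenericRecord> reader = new GenericDatumReader<>(USER_SCHEMA);
    BinaryDecoder decoder = DecoderFactory.get().binaryDecoder(bytes, null);
    return reader.read(null, decoder);
  }

  public static void main(String[] args) throws IOException {
    Properties props = new Properties();
    props.put("bootstrap.servers", "localhost:9092"); // assumption: a local broker
    props.put("key.serializer", StringSerializer.class.getName());
    props.put("value.serializer", ByteArraySerializer.class.getName());

    GenericRecord user = new GenericData.Record(USER_SCHEMA);
    user.put("name", "alice");
    user.put("age", 30);

    // Kafka's plain Java producer just ships the already-encoded bytes.
    try (KafkaProducer<String, byte[]> producer = new KafkaProducer<>(props)) {
      producer.send(new ProducerRecord<>("users", "alice", serialize(user)));
    }
  }
}

On the consumer side the mirror image applies: configure a ByteArrayDeserializer, poll the records, and pass each record's value to a deserialize() helper like the one above.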
Lastly, you also have the option to use some utility libraries (whether from Confluent or elsewhere) so that you don't have to use the Apache Avro API directly. For what it's worth, I have published some slightly more complex examples at kafka-storm-starter, e.g. as demonstrated by AvroDecoderBolt.scala. Here, the Avro serialization/deserialization is done by using the Scala library Twitter Bijection. Here's an example snippet of AvroDecoderBolt.scala to give you the general idea:
// This tells Bijection how to automagically deserialize a Java type `T`,
// given a byte array `byte[]`.
implicit private val specificAvroBinaryInjection: Injection[T, Array[Byte]] =
    SpecificAvroCodecs.toBinary[T]

// Let's put Bijection to use.
private def decodeAndEmit(bytes: Array[Byte], collector: BasicOutputCollector) {
  require(bytes != null, "bytes must not be null")
  val decodeTry = Injection.invert(bytes) // <-- deserialization, using Twitter Bijection, happens here
  decodeTry match {
    case Success(pojo) =>
      log.debug("Binary data decoded into pojo: " + pojo)
      collector.emit(new Values(pojo)) // <-- Here we are telling Storm to send the decoded payload to downstream consumers
      ()
    case Failure(e) => log.error("Could not decode binary data: " + Throwables.getStackTraceAsString(e))
  }
}
So yes, you can of course opt to not use any additional libraries such as Confluent's Avro serializers/deserializers (currently made available as part of confluentinc/schema-registry) or Twitter's Bijection. Whether that's worth the additional effort is up to you to decide.