
Get a typed value from an Avro GenericRecord

Tags:

java

avro

Given a GenericRecord, what is the recommended way to retrieve a typed value, as opposed to an Object? Are we expected to cast the values, and if so, what is the mapping from Avro types to Java types? For example, an Avro array maps to a Java Collection, and an Avro string maps to Utf8.

Since every GenericRecord contains its schema, I was hoping for a type-safe way to retrieve values.

jaco0646 asked Dec 03 '15 15:12





1 Answer

Avro has eight primitive types and five complex types (excluding unions, which are a combination of other types). The following table maps these 13 Avro types to their input interfaces (the Java types that can be put into a GenericRecord) and their output implementations (the concrete Java types returned by calling get on a GenericRecord). The values apply to Avro 1.7.7.

╔═══════════╦════════════════════════╦═══════════════════════════╗
║ Avro Type ║ Input Interface        ║ Output Implementation     ║
╠═══════════╬════════════════════════╬═══════════════════════════╣
║ null      ║                        ║ null                      ║
║ boolean   ║ java.lang.Boolean      ║ java.lang.Boolean         ║
║ int       ║ java.lang.Integer      ║ java.lang.Integer         ║
║ long      ║ java.lang.Long         ║ java.lang.Long            ║
║ float     ║ java.lang.Float        ║ java.lang.Float           ║
║ double    ║ java.lang.Double       ║ java.lang.Double          ║
║ bytes     ║ java.nio.ByteBuffer    ║ java.nio.HeapByteBuffer   ║
║ string    ║ java.lang.CharSequence ║ org.apache.avro.util.Utf8 ║
║ record    ║ *.GenericRecord        ║ *.GenericData$Record      ║
║ enum      ║ java.lang.CharSequence ║ *.GenericData$EnumSymbol  ║
║ array     ║ java.util.Collection   ║ *.GenericData$Array       ║
║ map       ║ java.util.Map          ║ java.util.HashMap         ║
║ fixed     ║ *.GenericFixed         ║ *.GenericData$Fixed       ║
╚═══════════╩════════════════════════╩═══════════════════════════╝
* == org.apache.avro.generic


In Avro 1.8.0, the enum type requires a GenericEnumSymbol. It no longer accepts CharSequence.
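Since get returns an Object, one way to keep the casts in a single place is a small checked-cast helper. The following is only a sketch under stated assumptions: TypedFields and its methods are hypothetical names, not part of Avro, and the field lookup is taken as a plain java.util.function.Function so that the snippet compiles without Avro on the classpath. With Avro available, passing record::get for a GenericRecord should fit the same parameter, since get(String) returns Object. In the demo, a HashMap stands in for the record.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper (not part of Avro): centralizes the Object-to-type casts.
final class TypedFields {
    private TypedFields() {}

    /**
     * Looks up a field and casts it to the expected type. A mismatch fails
     * with a message naming the field rather than a bare ClassCastException.
     */
    static <T> T get(Function<String, Object> lookup, String field, Class<T> type) {
        Object value = lookup.apply(field);
        if (value == null) {
            return null; // a null value passes through unchanged
        }
        if (!type.isInstance(value)) {
            throw new IllegalArgumentException("Field '" + field + "' holds "
                    + value.getClass().getName() + ", expected " + type.getName());
        }
        return type.cast(value);
    }

    /** Strings come back as a CharSequence (Utf8 in the table above), so convert explicitly. */
    static String getString(Function<String, Object> lookup, String field) {
        CharSequence chars = get(lookup, field, CharSequence.class);
        return chars == null ? null : chars.toString();
    }

    public static void main(String[] args) {
        // Stand-in for a GenericRecord so the example runs without Avro.
        Map<String, Object> record = new HashMap<>();
        record.put("id", 7L);                           // Avro long -> java.lang.Long
        record.put("name", new StringBuilder("alice")); // any CharSequence, like Utf8

        Long id = get(record::get, "id", Long.class);
        String name = getString(record::get, "name");
        System.out.println(id + " " + name); // prints: 7 alice
    }
}
```

Asking for the interface (CharSequence, Collection, Map) rather than the concrete output class keeps the caller independent of which implementation a given Avro version returns.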

jaco0646 answered Oct 12 '22 19:10
