
How to define an Avro union in Java

I have defined an Avro schema composed of a record which contains a union of two (or more) different records such as:

{
  "type":"record",
  "name":"MyCompositeRecord",
  "fields":
  [
    {"name":"SomeCommonData","type":"string"},
    {"name":"MoreCommonData","type":"float"},
    {"name":"CompositeRecord","type":
      [
        {
          "type":"record",
          "name":"FirstOption",
          "fields":
          [
            {"name":"x","type":"string"},
            {"name":"y","type":"long"}
          ]
        },
        {
          "type":"record",
          "name":"SecondOption",
          "fields":
          [
            {"name":"z","type":"int"},
            {"name":"w","type":"float"},
            {"name":"m","type":"double"},
            {"name":"l","type":"boolean"}
          ]
        }
      ]
    }
  ]
}

It doesn't look very clear but I hope you get the idea: I have a record composed of some data ("SomeCommonData" and "MoreCommonData") and a union of two different types of records ("FirstOption" and "SecondOption"). At serialization/deserialization time I should be able to create either one of the two sub-records and serialize a "MyCompositeRecord".

I haven't tried generating code for the schema since I'm planning on using just generic records. However, I'm not sure if and how such generic records can be serialized, and I can't find any example online. I'm going to use Java to serialize/deserialize. I was able to create a writer/reader for the schema as follows:

Schema.Parser parser = new Schema.Parser();
Schema schema = parser.parse(COMPOSITE_SCHEMA);
DatumWriter<GenericRecord> writer = new GenericDatumWriter<>(schema);
DatumReader<GenericRecord> reader = new GenericDatumReader<>(schema);
GenericRecord datum = new GenericData.Record(schema);

Any ideas on how to proceed from here to actually build the record?

Thanks

Giovanni Botta asked Nov 04 '22 00:11



1 Answer

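Basically, setting a union field is no different from setting any other field:

GenericRecord datum = new GenericData.Record(schema);

datum.put(2, data);

where 2 is the position of the union field (CompositeRecord) in the schema above and data is the value being set, here a GenericRecord built from the FirstOption or SecondOption schema.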
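To make that concrete, here is a minimal sketch of a full round trip using generic records and Avro's binary encoding. It assumes the Apache Avro Java library is on the classpath. The class name UnionRoundTrip, the helper method names, and the sample values are made up for illustration, and the schema string is the one from the question, condensed. It puts fields by name rather than by position, which is equivalent.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.BinaryDecoder;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.DatumReader;
import org.apache.avro.io.DatumWriter;
import org.apache.avro.io.DecoderFactory;
import org.apache.avro.io.EncoderFactory;

// Hypothetical class name; not part of the original question or answer.
class UnionRoundTrip {
    // The schema from the question, condensed onto fewer lines.
    static final String COMPOSITE_SCHEMA =
        "{\"type\":\"record\",\"name\":\"MyCompositeRecord\",\"fields\":["
        + "{\"name\":\"SomeCommonData\",\"type\":\"string\"},"
        + "{\"name\":\"MoreCommonData\",\"type\":\"float\"},"
        + "{\"name\":\"CompositeRecord\",\"type\":["
        + "{\"type\":\"record\",\"name\":\"FirstOption\",\"fields\":["
        + "{\"name\":\"x\",\"type\":\"string\"},{\"name\":\"y\",\"type\":\"long\"}]},"
        + "{\"type\":\"record\",\"name\":\"SecondOption\",\"fields\":["
        + "{\"name\":\"z\",\"type\":\"int\"},{\"name\":\"w\",\"type\":\"float\"},"
        + "{\"name\":\"m\",\"type\":\"double\"},{\"name\":\"l\",\"type\":\"boolean\"}]}"
        + "]}]}";

    // Builds a MyCompositeRecord whose union field holds a FirstOption record.
    static GenericRecord buildWithFirstOption(Schema schema) {
        // The union's branch schemas are listed in declaration order.
        Schema unionSchema = schema.getField("CompositeRecord").schema();
        Schema firstOptionSchema = unionSchema.getTypes().get(0);

        GenericRecord first = new GenericData.Record(firstOptionSchema);
        first.put("x", "hello"); // sample value
        first.put("y", 42L);     // sample value

        GenericRecord datum = new GenericData.Record(schema);
        datum.put("SomeCommonData", "common");
        datum.put("MoreCommonData", 1.5f);
        // The writer picks the union branch from the sub-record's own schema.
        datum.put("CompositeRecord", first);
        return datum;
    }

    static byte[] serialize(Schema schema, GenericRecord datum) throws IOException {
        DatumWriter<GenericRecord> writer = new GenericDatumWriter<>(schema);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(out, null);
        writer.write(datum, encoder);
        encoder.flush();
        return out.toByteArray();
    }

    static GenericRecord deserialize(Schema schema, byte[] bytes) throws IOException {
        DatumReader<GenericRecord> reader = new GenericDatumReader<>(schema);
        BinaryDecoder decoder = DecoderFactory.get().binaryDecoder(bytes, null);
        return reader.read(null, decoder);
    }

    public static void main(String[] args) throws IOException {
        Schema schema = new Schema.Parser().parse(COMPOSITE_SCHEMA);
        GenericRecord copy = deserialize(schema, serialize(schema, buildWithFirstOption(schema)));

        // After reading, the branch is identified by the sub-record's schema name.
        GenericRecord branch = (GenericRecord) copy.get("CompositeRecord");
        System.out.println(branch.getSchema().getName());
        System.out.println(branch.get("y"));
    }
}
```

To write the other branch, build the sub-record from unionSchema.getTypes().get(1) instead and fill in z, w, m and l. Note that string fields come back from the generic reader as org.apache.avro.util.Utf8 by default, so call toString() before comparing them with a Java String.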

If you look at getDefaultValue in AvroEditor - Helper, you will see the default values I use for each Avro Type. Arrays must implement GenericArray.

Bruce Martin answered Nov 15 '22 00:11
