I defined two versions of a record in two separate AVCS schema files. I used the namespace to distinguish versions SimpleV1.avsc
{
"type" : "record",
"name" : "Simple",
"namespace" : "test.simple.v1",
"fields" : [
{
"name" : "name",
"type" : "string"
},
{
"name" : "status",
"type" : {
"type" : "enum",
"name" : "Status",
"symbols" : [ "ON", "OFF" ]
},
"default" : "ON"
}
]
}
Example JSON
{"name":"A","status":"ON"}
Version 2 just has an additional description field with default value.
SimpleV2.avsc
{
"type" : "record",
"name" : "Simple",
"namespace" : "test.simple.v2",
"fields" : [
{
"name" : "name",
"type" : "string"
},
{
"name" : "description",
"type" : "string",
"default" : ""
},
{
"name" : "status",
"type" : {
"type" : "enum",
"name" : "Status",
"symbols" : [ "ON", "OFF" ]
},
"default" : "ON"
}
]
}
Example JSON
{"name":"B","description":"b","status":"ON"}
Both schemas were serialized to Java classes. In my example I was going to test backward compatibility. A record written by V1 shall be read by a reader using V2. I wanted to see that default values are inserted. This is working as long as I do not use enums.
public class EnumEvolutionExample {
public static void main(String[] args) throws IOException {
Schema schemaV1 = new org.apache.avro.Schema.Parser().parse(new File("./src/main/resources/SimpleV1.avsc"));
//works as well
//Schema schemaV1 = test.simple.v1.Simple.getClassSchema();
Schema schemaV2 = new org.apache.avro.Schema.Parser().parse(new File("./src/main/resources/SimpleV2.avsc"));
test.simple.v1.Simple simpleV1 = test.simple.v1.Simple.newBuilder()
.setName("A")
.setStatus(test.simple.v1.Status.ON)
.build();
SchemaPairCompatibility schemaCompatibility = SchemaCompatibility.checkReaderWriterCompatibility(
schemaV2,
schemaV1);
//Checks that writing v1 and reading v2 schemas is compatible
Assert.assertEquals(SchemaCompatibilityType.COMPATIBLE, schemaCompatibility.getType());
byte[] binaryV1 = serealizeBinary(simpleV1);
//Crashes with: AvroTypeException: Found test.simple.v1.Status, expecting test.simple.v2.Status
test.simple.v2.Simple v2 = deSerealizeBinary(binaryV1, new test.simple.v2.Simple(), schemaV1);
}
public static byte[] serealizeBinary(SpecificRecord record) {
DatumWriter<SpecificRecord> writer = new SpecificDatumWriter<>(record.getSchema());
byte[] data = new byte[0];
ByteArrayOutputStream stream = new ByteArrayOutputStream();
Encoder binaryEncoder = EncoderFactory.get()
.binaryEncoder(stream, null);
try {
writer.write(record, binaryEncoder);
binaryEncoder.flush();
data = stream.toByteArray();
} catch (IOException e) {
System.out.println("Serialization error " + e.getMessage());
}
return data;
}
public static <T extends SpecificRecord> T deSerealizeBinary(byte[] data, T reuse, Schema writer) {
Decoder decoder = DecoderFactory.get().binaryDecoder(data, null);
DatumReader<T> datumReader = new SpecificDatumReader<>(writer, reuse.getSchema());
try {
T datum = datumReader.read(null, decoder);
return datum;
} catch (IOException e) {
System.out.println("Deserialization error" + e.getMessage());
}
return null;
}
}
The checkReaderWriterCompatibility method confirms that schemas are compatible. But when I deserialize I’m getting the following exception
Exception in thread "main" org.apache.avro.AvroTypeException: Found test.simple.v1.Status, expecting test.simple.v2.Status
at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:309)
at org.apache.avro.io.parsing.Parser.advance(Parser.java:86)
at org.apache.avro.io.ResolvingDecoder.readEnum(ResolvingDecoder.java:260)
at org.apache.avro.generic.GenericDatumReader.readEnum(GenericDatumReader.java:267)
at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:181)
at org.apache.avro.specific.SpecificDatumReader.readField(SpecificDatumReader.java:136)
at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:247)
at org.apache.avro.specific.SpecificDatumReader.readRecord(SpecificDatumReader.java:123)
at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:179)
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:160)
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:153)
at test.EnumEvolutionExample.deSerealizeBinary(EnumEvolutionExample.java:70)
at test.EnumEvolutionExample.main(EnumEvolutionExample.java:45)
I don’t understand why Avro thinks it got a v1.Status. Namespaces are not part of the encoding. Is this a bug or has anyone an idea how get that running?
Try adding an @aliases.
For example:
v1
{
"type" : "record",
"name" : "Simple",
"namespace" : "test.simple.v1",
"fields" : [
{
"name" : "name",
"type" : "string"
},
{
"name" : "status",
"type" : {
"type" : "enum",
"name" : "Status",
"symbols" : [ "ON", "OFF" ]
},
"default" : "ON"
}
]
}
v2
{
"type" : "record",
"name" : "Simple",
"namespace" : "test.simple.v2",
"fields" : [
{
"name" : "name",
"type" : "string"
},
{
"name" : "description",
"type" : "string",
"default" : ""
},
{
"name" : "status",
"type" : {
"type" : "enum",
"name" : "Status",
"aliases" : [ "test.simple.v1.Status" ]
"symbols" : [ "ON", "OFF" ]
},
"default" : "ON"
}
]
}
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With