The example schema below contains a field that is a union of null and string:
{
"type":"record",
"name":"DataFlowEntity",
"namespace":"org.sdf.manage.commons.server",
"fields":
[
{"name":"dataTypeGroupName","type":["null","string"]},
{"name":"dataTypeName","type":"string"},
{"name":"dataSchemaVersion","type":"string"}
]
}
I want to convert the following JSON object,
{
"dataTypeGroupName": "dg_1",
"dataTypeName": "dt_1",
"dataSchemaVersion": "1"
}
into an Avro object corresponding to the above schema. I tried Avro's JsonDecoder with the code snippet below:
String dataFlowEntity = "{\"dataTypeGroupName\": \"dg_1\", \"dataTypeName\": \"dt_1\", \"dataSchemaVersion\": \"1\"}";
Schema schema = DataFlowEntity.SCHEMA$;
InputStream inputStream = new ByteArrayInputStream(dataFlowEntity.getBytes());
DataInputStream dInputStream = new DataInputStream(inputStream);
Decoder decoder = DecoderFactory.get().jsonDecoder(schema, dInputStream);
DatumReader<DataFlowEntity> datumReader = new GenericDatumReader<DataFlowEntity>(schema);
DataFlowEntity dataFlowEntityObject = DataFlowEntity.newBuilder().build();
dataFlowEntityObject = datumReader.read(null, decoder);
It fails with the following exception:
threw exception [org.apache.avro.AvroRuntimeException: org.apache.avro.AvroRuntimeException: Field dataTypeGroupName type:UNION pos:0 not set and has no default value] with root cause
org.apache.avro.AvroRuntimeException: Field dataTypeGroupName type:UNION pos:0 not set and has no default value
at org.apache.avro.generic.GenericData.getDefaultValue(GenericData.java:874)
at org.apache.avro.data.RecordBuilderBase.defaultValue(RecordBuilderBase.java:135)
If using Node.js is an option, you can use avsc to do the conversion for you. Calling clone with wrapUnions set will automatically wrap values into the first union branch they match.
Using your example:
var avsc = require('avsc');
var type = avsc.parse({
"type":"record",
"name":"DataFlowEntity",
"namespace":"org.sdf.manage.commons.server",
"fields": [
{"name":"dataTypeGroupName","type":["null","string"]},
{"name":"dataTypeName","type":"string"},
{"name":"dataSchemaVersion","type":"string"}
]
}, {wrapUnions: true});
var invalidRecord = {
"dataTypeGroupName": "dg_1",
"dataTypeName": "dt_1",
"dataSchemaVersion": "1"
};
var validRecord = type.clone(invalidRecord, {wrapUnions: true});
// == {
// "dataTypeGroupName":{"string":"dg_1"},
// "dataTypeName":"dt_1",
// "dataSchemaVersion":"1"
// }
Check this project out: https://github.com/allegro/hermes/pull/749/files
The part you are interested in is the JsonAvroConverter. It deserializes JSON (without union type wrappers) into Avro generated objects (which have union types). It reads the branch types of each union from the schema and tries them one by one. It works very well in our case.
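To illustrate the "try the union branches one by one" idea, here is a minimal sketch in plain Java. It is not the converter's actual code: the class UnionWrapSketch and the helpers matches and wrapUnion are hypothetical names, only a few primitive branch types are handled, and branches are given as type-name strings rather than Avro Schema objects. It wraps a plain JSON value the way Avro's JSON encoding expects a union value to look.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class UnionWrapSketch {

    // Hypothetical helper: does a plain JSON value fit the named primitive branch?
    static boolean matches(String branch, Object value) {
        switch (branch) {
            case "null":    return value == null;
            case "string":  return value instanceof String;
            case "boolean": return value instanceof Boolean;
            case "int":     return value instanceof Integer;
            case "long":    return value instanceof Integer || value instanceof Long;
            case "double":  return value instanceof Number;
            default:        return false; // records, arrays, maps are out of scope for this sketch
        }
    }

    // Try the branches in schema order and wrap the value under the first one that matches.
    static Object wrapUnion(List<String> branches, Object value) {
        for (String branch : branches) {
            if (matches(branch, value)) {
                if (value == null) {
                    return null; // Avro's JSON encoding writes a null union value bare
                }
                Map<String, Object> wrapped = new LinkedHashMap<>();
                wrapped.put(branch, value); // e.g. {"string": "dg_1"}
                return wrapped;
            }
        }
        throw new IllegalArgumentException(
                "No union branch in " + branches + " matches value " + value);
    }

    public static void main(String[] args) {
        List<String> union = Arrays.asList("null", "string");
        System.out.println(wrapUnion(union, "dg_1")); // prints {string=dg_1}
        System.out.println(wrapUnion(union, null));   // prints null
    }
}
```

Applied to the dataTypeGroupName field from the question, the plain value "dg_1" becomes the wrapped form {"string": "dg_1"}, which is the shape Avro's JsonDecoder expects for a non-null union value.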
This does the job: https://github.com/allegro/json-avro-converter/blob/master/converter/src/main/java/tech/allegro/schema/json2avro/converter/JsonGenericRecordReader.java
Regards!