In Avro, is there any difference between calling toString() on a GenericRecord and using the JsonEncoder?

Tags: java, json, avro

I have some Avro data as GenericRecords in Java that I want to convert to JSON, and I notice there are two ways to do this: one involves using the JsonEncoder, and the other involves simply calling toString() on the GenericRecord.

After some brief experimentation, both approaches seem to produce equivalent results, and the resulting JSON string can be converted back into Avro using the JsonDecoder in either case. So, my question is:

Is there any functional difference between the two, and is there any reason to use one over the other?

I'm using Avro 1.7.7.

alphaloop asked Oct 08 '14 00:10


1 Answer

After some further testing and a look at the Avro source, it seems that the toString() method on GenericRecord is implemented by GenericData.Record.toString(), which calls GenericData.toString(). The javadoc on this method states that it is supposed to provide a valid JSON representation of the record, which it does only approximately.

However, its implementation differs from that of the JsonEncoder. The JsonEncoder uses the Jackson libraries and follows the Avro schema while writing. GenericRecord.toString() simply walks the record and builds the JSON text with a StringBuilder, paying much less attention to the schema.
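For reference, the two approaches can be sketched side by side as follows. This is a minimal sketch, assuming the Avro library is on the classpath; the class name `AvroToJson`, the `User` schema, and the helper methods are hypothetical names made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.EncoderFactory;
import org.apache.avro.io.JsonEncoder;

public class AvroToJson {
    // Hypothetical schema with primitive fields only (no unions).
    static final Schema SCHEMA = new Schema.Parser().parse(
        "{\"type\":\"record\",\"name\":\"User\",\"fields\":["
        + "{\"name\":\"name\",\"type\":\"string\"},"
        + "{\"name\":\"age\",\"type\":\"int\"}]}");

    static GenericRecord sampleRecord() {
        GenericRecord record = new GenericData.Record(SCHEMA);
        record.put("name", "al");
        record.put("age", 30);
        return record;
    }

    // Approach 1: schema-driven serialization through the JsonEncoder.
    static String viaJsonEncoder(GenericRecord record) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            Schema schema = record.getSchema();
            JsonEncoder encoder = EncoderFactory.get().jsonEncoder(schema, out);
            new GenericDatumWriter<GenericRecord>(schema).write(record, encoder);
            encoder.flush(); // the encoder buffers, so flush before reading the bytes
            return out.toString("UTF-8");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Approach 2: the convenience representation built by GenericData.toString().
    static String viaToString(GenericRecord record) {
        return record.toString();
    }

    public static void main(String[] args) {
        GenericRecord record = sampleRecord();
        System.out.println(viaJsonEncoder(record));
        System.out.println(viaToString(record));
    }
}
```

For a schema like this one, with only primitive fields, the two strings should differ only in whitespace, which matches the observation in the question that both approaches seem equivalent at first.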

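This means there are cases where toString() produces a JSON representation that can't be deserialized using the JsonDecoder, for example when the schema contains unions. The JsonEncoder wraps a non-null union value in an object keyed by the branch type, such as {"string":"al"}, and the JsonDecoder expects that form, whereas toString() writes the bare value.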
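The union case can be reproduced with a small round-trip check. This is a sketch under assumptions: Avro is on the classpath, and the class name `UnionRoundTrip`, the schema with its optional `nickname` field, and the `decodes` helper are hypothetical names chosen for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

import org.apache.avro.AvroTypeException;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.DecoderFactory;
import org.apache.avro.io.EncoderFactory;
import org.apache.avro.io.JsonDecoder;
import org.apache.avro.io.JsonEncoder;

public class UnionRoundTrip {
    // Hypothetical schema whose only field is a ["null","string"] union.
    static final Schema SCHEMA = new Schema.Parser().parse(
        "{\"type\":\"record\",\"name\":\"User\",\"fields\":["
        + "{\"name\":\"nickname\",\"type\":[\"null\",\"string\"]}]}");

    static GenericRecord sampleRecord() {
        GenericRecord record = new GenericData.Record(SCHEMA);
        record.put("nickname", "al");
        return record;
    }

    static String viaJsonEncoder(GenericRecord record) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            JsonEncoder encoder = EncoderFactory.get().jsonEncoder(SCHEMA, out);
            new GenericDatumWriter<GenericRecord>(SCHEMA).write(record, encoder);
            encoder.flush();
            return out.toString("UTF-8");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns true if the JsonDecoder can read the text back against SCHEMA.
    static boolean decodes(String json) {
        try {
            JsonDecoder decoder = DecoderFactory.get().jsonDecoder(SCHEMA, json);
            new GenericDatumReader<GenericRecord>(SCHEMA).read(null, decoder);
            return true;
        } catch (AvroTypeException | IOException e) {
            // The decoder rejects a bare value where it expects a tagged union.
            return false;
        }
    }

    public static void main(String[] args) {
        GenericRecord record = sampleRecord();
        String encoded = viaJsonEncoder(record); // union value tagged with its branch type
        String printed = record.toString();      // union value written bare
        System.out.println(encoded + " decodes: " + decodes(encoded));
        System.out.println(printed + " decodes: " + decodes(printed));
    }
}
```

Only the JsonEncoder output is expected to decode here; the toString() output lacks the branch tag the JsonDecoder looks for.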

Based on this, it looks like toString() is a simple and convenient way to get a human-readable representation of the record, but it is unreliable as a way to serialize the data according to the schema.

alphaloop answered Oct 24 '22 10:10