
How to generate schema-less avro files using apache avro?

Tags: java, apache, avro

I am using Apache Avro for data serialization. Since the data has a fixed schema, I do not want the schema to be part of the serialized data. In the following example, the schema is embedded in the Avro file "users.avro".

User user1 = new User();
user1.setName("Alyssa");
user1.setFavoriteNumber(256);
User user2 = new User("Ben", 7, "red");
User user3 = User.newBuilder()
         .setName("Charlie")
         .setFavoriteColor("blue")
         .setFavoriteNumber(null)
         .build();

// Serialize user1, user2 and user3 to disk
File file = new File("users.avro");
DatumWriter<User> userDatumWriter = new SpecificDatumWriter<User>(User.class);
DataFileWriter<User> dataFileWriter = new DataFileWriter<User>(userDatumWriter);
dataFileWriter.create(user1.getSchema(), file);
dataFileWriter.append(user1);
dataFileWriter.append(user2);
dataFileWriter.append(user3);
dataFileWriter.close();

Can anyone please tell me how to store Avro files without the schema embedded in them?

asked Mar 02 '15 11:03 by mintra


1 Answer

I wrote a comprehensive how-to that explains how to achieve schema-less serialization with Apache Avro. A companion test campaign gives some figures on the performance you can expect.

The code is on GitHub: the example and test classes show how to use the data reader and writer with a stub class generated by Avro itself.
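
For readers who cannot follow the links, here is a minimal sketch of the general idea. It is not the code from the GitHub repository mentioned above. It assumes the usual approach of skipping DataFileWriter (which always writes the schema into the file header) and writing each datum directly through a BinaryEncoder. The class name SchemalessAvro, the inline schema and its field names are illustrative assumptions. With a generated class such as User, a SpecificDatumWriter<User> and SpecificDatumReader<User> should be usable in place of the generic ones.

```java
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.BinaryDecoder;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.DatumReader;
import org.apache.avro.io.DatumWriter;
import org.apache.avro.io.DecoderFactory;
import org.apache.avro.io.EncoderFactory;

import java.io.ByteArrayOutputStream;
import java.io.IOException;

public class SchemalessAvro {

    // Hypothetical schema for illustration; writer and reader must agree on it out of band.
    static final Schema SCHEMA = new Schema.Parser().parse(
        "{\"type\":\"record\",\"name\":\"User\",\"namespace\":\"example.avro\",\"fields\":["
      + "{\"name\":\"name\",\"type\":\"string\"},"
      + "{\"name\":\"favorite_number\",\"type\":[\"int\",\"null\"]},"
      + "{\"name\":\"favorite_color\",\"type\":[\"string\",\"null\"]}]}");

    // Encode one record as raw Avro binary: no file header, no embedded schema.
    static byte[] serialize(GenericRecord record) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        DatumWriter<GenericRecord> writer = new GenericDatumWriter<>(SCHEMA);
        BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(out, null);
        writer.write(record, encoder);
        encoder.flush(); // the encoder buffers, so flush before reading the bytes
        return out.toByteArray();
    }

    // Decode raw Avro binary; the schema is supplied by the caller, not by the data.
    static GenericRecord deserialize(byte[] bytes) throws IOException {
        DatumReader<GenericRecord> reader = new GenericDatumReader<>(SCHEMA);
        BinaryDecoder decoder = DecoderFactory.get().binaryDecoder(bytes, null);
        return reader.read(null, decoder);
    }

    public static void main(String[] args) throws IOException {
        GenericRecord user = new GenericData.Record(SCHEMA);
        user.put("name", "Alyssa");
        user.put("favorite_number", 256);
        user.put("favorite_color", null);

        byte[] bytes = serialize(user);
        GenericRecord back = deserialize(bytes);
        System.out.println(bytes.length + " bytes, decoded: " + back);
    }
}
```

The trade-off is that the bytes are no longer self-describing: the reader must be given exactly the schema that was used for writing (or a compatible one through Avro's schema resolution), and the container-file features of DataFileWriter, such as sync markers and block compression, are not available.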

answered Sep 21 '22 14:09 by Paolo Maresca