
Can you append data to an existing Avro data file?

Tags:

avro

It seems like there isn't any way to append data to an existing Avro data file. I'd like to have multiple processes writing to a single Avro file, but each time I open it for writing, I start over from scratch. I don't want to read in all the data and then write it back out again.

Using the Ruby example code, I have tried opening the file with "ab" and "ab+" instead of "wb", but no joy.

require 'avro'

# SCHEMA is the JSON schema string from the Ruby example code.
file = File.open('data.avr', 'wb')
schema = Avro::Schema.parse(SCHEMA)
writer = Avro::IO::DatumWriter.new(schema)
dw = Avro::DataFile::Writer.new(file, writer, schema)
dw << {"username" => "john", "age" => 25, "verified" => true}
dw << {"username" => "ryan", "age" => 23, "verified" => false}
dw.close
Eric Pugh asked Jan 10 '12 16:01



1 Answer

I did figure out how to do it in Java using the appendTo method:

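import java.io.File;
import org.apache.avro.file.CodecFactory;
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.io.DatumWriter;
import org.apache.avro.reflect.ReflectDatumWriter;

DatumWriter<Record> writer = new ReflectDatumWriter<>(Record.class);
DataFileWriter<Record> file = new DataFileWriter<>(writer);
file.setMeta("version", 1);
file.setMeta("creator", "ThinkBigAnalytics");
file.setCodec(CodecFactory.deflateCodec(5));
// Use create() instead when the file does not exist yet:
// file.create(schema, new File("/tmp/records"));
file.appendTo(new File("/tmp/records"));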
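
For anyone who wants to try this end to end, here is a minimal sketch of the same appendTo approach using generic records rather than reflection. The class name AvroAppendSketch, the helper methods, and the User schema (modelled on the fields in the question) are assumptions for illustration, not part of the code above. It needs the Apache Avro Java library on the classpath.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.avro.Schema;
import org.apache.avro.file.DataFileReader;
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;

public class AvroAppendSketch {
    // Hypothetical schema mirroring the fields used in the question.
    static final Schema SCHEMA = new Schema.Parser().parse(
        "{\"type\":\"record\",\"name\":\"User\",\"fields\":["
        + "{\"name\":\"username\",\"type\":\"string\"},"
        + "{\"name\":\"age\",\"type\":\"int\"},"
        + "{\"name\":\"verified\",\"type\":\"boolean\"}]}");

    // Builds one record that conforms to SCHEMA.
    static GenericRecord user(String name, int age, boolean verified) {
        GenericRecord r = new GenericData.Record(SCHEMA);
        r.put("username", name);
        r.put("age", age);
        r.put("verified", verified);
        return r;
    }

    // Writes one record: creates the file when it is missing or empty,
    // otherwise appends to the existing data file.
    static void write(File target, GenericRecord record) throws IOException {
        try (DataFileWriter<GenericRecord> out =
                 new DataFileWriter<>(new GenericDatumWriter<GenericRecord>(SCHEMA))) {
            if (target.exists() && target.length() > 0) {
                out.appendTo(target);        // continue after the existing blocks
            } else {
                out.create(SCHEMA, target);  // write a fresh header first
            }
            out.append(record);
        }
    }

    // Reads the file back and collects the username of every record.
    static List<String> usernames(File source) throws IOException {
        List<String> names = new ArrayList<>();
        try (DataFileReader<GenericRecord> in =
                 new DataFileReader<>(source, new GenericDatumReader<GenericRecord>())) {
            for (GenericRecord r : in) {
                names.add(r.get("username").toString());
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("records", ".avro");
        write(f, user("john", 25, true));   // first call creates the file
        write(f, user("ryan", 23, false));  // second call appends to it
        System.out.println(usernames(f));
    }
}
```

Each call to write opens the file, adds one record, and closes it again, so the second call only succeeds in keeping the first record if appending works. The sketch does no locking, so it says nothing about several processes writing at the same time.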

However, I'd love to do it from Ruby.

Eric Pugh answered Oct 13 '22 07:10
