
Avro schema evolution

Tags: avro

I have two questions:

  1. Can a single reader parse records that were written with two compatible schemas? For example, Schema V2 differs from Schema V1 only by one additional optional field, and I want the reader to understand both. I think the answer is no, but if it is possible, how do I do it?

  2. I tried writing a record with Schema V1 and reading it with Schema V2, but I get the following error:

    org.apache.avro.AvroTypeException: Found foo, expecting foo

I am using avro-1.7.3 with the following writer and reader:

   writer = new GenericDatumWriter<GenericData.Record>(SchemaV1);
   reader = new GenericDatumReader<GenericData.Record>(SchemaV2, SchemaV1);

Here are examples of the two schemas (I have tried adding a namespace as well, but no luck).

Schema V1:

{
"name": "foo",
"type": "record",
"fields": [{
    "name": "products",
    "type": {
        "type": "array",
        "items": {
            "name": "product",
            "type": "record",
            "fields": [{
                "name": "a1",
                "type": "string"
            }, {
                "name": "a2",
                "type": {"type": "fixed", "name": "a3", "size": 1}
            }, {
                "name": "a4",
                "type": "int"
            }, {
                "name": "a5",
                "type": "int"
            }]
        }
    }
}]
}

Schema V2:

{
"name": "foo",
"type": "record",
"fields": [{
    "name": "products",
    "type": {
        "type": "array",
        "items": {
            "name": "product",
            "type": "record",
            "fields": [{
                "name": "a1",
                "type": "string"
            }, {
                "name": "a2",
                "type": {"type": "fixed", "name": "a3", "size": 1}
            }, {
                "name": "a4",
                "type": "int"
            }, {
                "name": "a5",
                "type": "int"
            }]
        }
    }
}, {
    "name": "purchases",
    "type": ["null", {
        "type": "array",
        "items": {
            "name": "purchase",
            "type": "record",
            "fields": [{
                "name": "a1",
                "type": "int"
            }, {
                "name": "a2",
                "type": "int"
            }]
        }
    }]
}]
}

Thanks in advance.

asked Mar 11 '13 at 23:03 by magicalo


1 Answer

I encountered the same issue. It might be a bug in Avro, but you can probably work around it by adding "default": null to the "purchases" field.
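For illustration, this is roughly what the new field in Schema V2 would look like with the default added. It is a sketch based on the schema in the question, not a tested fix; the only change is the last key. A null default is valid here because "null" is the first branch of the union.

```json
{
    "name": "purchases",
    "type": ["null", {
        "type": "array",
        "items": {
            "name": "purchase",
            "type": "record",
            "fields": [
                {"name": "a1", "type": "int"},
                {"name": "a2", "type": "int"}
            ]
        }
    }],
    "default": null
}
```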

Check my blog for details: http://ben-tech.blogspot.com/2013/05/avro-schema-evolution.html
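To see why a default would matter, here is a toy model of the record-resolution rule described in the Avro specification: a reader field that the writer did not write takes its default, and it is an error if that field has no default. This is plain Java and does not use the Avro library; `ResolutionSketch`, `Field`, and `resolve` are hypothetical names made up for this sketch, and the record contents are placeholders.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of Avro's record-resolution rule. NOT the Avro implementation. */
public class ResolutionSketch {

    /** Reader-side field: a name plus an optional default (null is a legal default). */
    static final class Field {
        final String name;
        final boolean hasDefault;
        final Object defaultValue;

        Field(String name, boolean hasDefault, Object defaultValue) {
            this.name = name;
            this.hasDefault = hasDefault;
            this.defaultValue = defaultValue;
        }
    }

    /**
     * Builds the record the reader sees from the values the writer stored.
     * Writer fields unknown to the reader are dropped; reader fields unknown
     * to the writer take their default, or fail if they have none.
     */
    static Map<String, Object> resolve(Map<String, Object> written, List<Field> readerFields) {
        Map<String, Object> result = new LinkedHashMap<>();
        for (Field f : readerFields) {
            if (written.containsKey(f.name)) {
                result.put(f.name, written.get(f.name));
            } else if (f.hasDefault) {
                result.put(f.name, f.defaultValue);
            } else {
                throw new IllegalStateException("reader field without default: " + f.name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // A record as written with Schema V1: it only has "products".
        Map<String, Object> v1Record = new LinkedHashMap<>();
        v1Record.put("products", Arrays.asList("p1", "p2"));

        // V2 as posted in the question: "purchases" has no default.
        List<Field> v2AsPosted = Arrays.asList(
            new Field("products", false, null),
            new Field("purchases", false, null));
        try {
            resolve(v1Record, v2AsPosted);
        } catch (IllegalStateException e) {
            System.out.println("failed: " + e.getMessage());
        }

        // V2 with "default": null on "purchases": the default fills the gap.
        List<Field> v2WithDefault = Arrays.asList(
            new Field("products", false, null),
            new Field("purchases", true, null));
        System.out.println(resolve(v1Record, v2WithDefault));
    }
}
```

With the real library, the two-argument `GenericDatumReader` constructor takes the writer's schema first and the reader's schema second, so reading V1 data as V2 would be `new GenericDatumReader<GenericData.Record>(SchemaV1, SchemaV2)`. Reading records written with either version through one reader then comes down to telling the reader which writer schema each record used while keeping V2 as the reader schema.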

answered Nov 02 '22 at 07:11 by Bewang