I'm trying to get Python to parse Avro schemas such as the following...
from avro import schema
mySchema = """
{
"name": "person",
"type": "record",
"fields": [
{"name": "firstname", "type": "string"},
{"name": "lastname", "type": "string"},
{
"name": "address",
"type": "record",
"fields": [
{"name": "streetaddress", "type": "string"},
{"name": "city", "type": "string"}
]
}
]
}"""
parsedSchema = schema.parse(mySchema)
...and I get the following exception:
avro.schema.SchemaParseException: Type property "record" not a valid Avro schema: Could not make an Avro Schema object from record.
What am I doing wrong?
Creating Avro Schemastype − This field comes under the document as well as the under the field named fields. In case of document, it shows the type of the document, generally a record because there are multiple fields. When it is field, the type describes data type.
Avro has a schema-based system. A language-independent schema is associated with its read and write operations. Avro serializes the data which has a built-in schema. Avro serializes the data into a compact binary format, which can be deserialized by any application.
Avro is used to define the data schema for a record's value. This schema describes the fields allowed in the value, along with their data types. You apply a schema to the value portion of an Oracle NoSQL Database record using Avro bindings.
Avro serializer/deserializers operate on fields in the order they are declared. Producers and Consumers must be on a compatible schema including the field order. Do not change the order of AVRO fields. All Producers and Consumers are must be updated at the same time if you change the field order.
According to other sources on the web I would rewrite your second address definition:
mySchema = """
{
"name": "person",
"type": "record",
"fields": [
{"name": "firstname", "type": "string"},
{"name": "lastname", "type": "string"},
{
"name": "address",
"type": {
"type" : "record",
"name" : "AddressUSRecord",
"fields" : [
{"name": "streetaddress", "type": "string"},
{"name": "city", "type": "string"}
]
}
}
]
}"""
Every time we provide the type as named type, the field needs to be given as:
"name":"some_name",
"type": {
"name":"CodeClassName",
"type":"record/enum/array"
}
However, if the named type is union, then we do not need an extra type field and should be usable as:
"name":"some_name",
"type": [{
"name":"CodeClassName1",
"type":"record",
"fields": ...
},
{
"name":"CodeClassName2",
"type":"record",
"fields": ...
}]
Hope this clarifies further!
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With