I can do this:
{
  "type": "record",
  "name": "Foo",
  "fields": [
    {"name": "bar", "type": {"type": "record", "name": "Bar", "fields": []}}
  ]
}
and that works fine, but supposing I want to split the schema up into two files such as:
{"type": "record", "name": "Foo", "fields": [{"name": "bar", "type": "Bar"}]}
{"type": "record", "name": "Bar", "fields": []}
Does Avro have the capability to do this?
Avro uses a schema to structure the data being encoded. It offers two schema languages: Avro IDL, intended for human editing, and a JSON-based format that is easier for machines to process.
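Avro's binary encoding writes record fields in the order they are declared, with no field names or tags in the data. A consumer that decodes with only its own schema must therefore agree with the producer on field order. Do not reorder Avro fields casually: if you do, all producers and consumers must be updated at the same time. The exception is a reader that is also given the writer's schema, in which case Avro's schema resolution matches fields by name and a different field order is tolerated.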
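To make the field-order point concrete, here is a minimal sketch that hand-rolls the two Avro binary encodings involved (zig-zag variable-length integers and length-prefixed UTF-8 strings) using only the Java standard library. It does not use the Avro library, and the class and method names (FieldOrderDemo, encode, decodeSameOrder, decodeSwappedOrder) are made up for illustration. The record has the position and keyWord fields from the SearchResult example, written in that order.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class FieldOrderDemo {

    // Avro writes int/long as a zig-zag value in variable-length base-128 bytes.
    static void writeLong(ByteArrayOutputStream out, long value) {
        long z = (value << 1) ^ (value >> 63);      // zig-zag: small magnitudes stay small
        while ((z & ~0x7FL) != 0) {
            out.write((int) ((z & 0x7F) | 0x80));   // low 7 bits plus a continuation flag
            z >>>= 7;
        }
        out.write((int) z);
    }

    static long readLong(ByteBuffer in) {
        long z = 0;
        int shift = 0;
        int b;
        do {
            b = in.get() & 0xFF;
            z |= (long) (b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        return (z >>> 1) ^ -(z & 1);                // undo the zig-zag step
    }

    // Avro writes a string as its UTF-8 byte length (a long) followed by the bytes.
    static void writeString(ByteArrayOutputStream out, String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        writeLong(out, utf8.length);
        out.write(utf8, 0, utf8.length);
    }

    static String readString(ByteBuffer in) {
        byte[] utf8 = new byte[(int) readLong(in)];
        in.get(utf8);
        return new String(utf8, StandardCharsets.UTF_8);
    }

    // Writer order: position (int) first, then keyWord (string). No names or tags are written.
    static byte[] encode(int position, String keyWord) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeLong(out, position);
        writeString(out, keyWord);
        return out.toByteArray();
    }

    // Reader that assumes the same field order as the writer.
    static String decodeSameOrder(byte[] data) {
        ByteBuffer in = ByteBuffer.wrap(data);
        long position = readLong(in);
        String keyWord = readString(in);
        return position + ":" + keyWord;
    }

    // Reader that assumes keyWord was written first: it misinterprets the same bytes.
    static String decodeSwappedOrder(byte[] data) {
        ByteBuffer in = ByteBuffer.wrap(data);
        String keyWord = readString(in);
        long position = readLong(in);
        return position + ":" + keyWord;
    }

    public static void main(String[] args) {
        byte[] data = encode(3, "avro");
        System.out.println(decodeSameOrder(data));   // prints "3:avro"
        // The swapped reader raises no error here; it silently returns different values.
        System.out.println(decodeSwappedOrder(data).equals(decodeSameOrder(data)));   // prints "false"
    }
}
```

The swapped reader does not necessarily fail. It can return plausible-looking garbage, which is why an order mismatch is easy to miss when the writer's schema is not available to the reader.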
Yes, it's possible.
I've done that in my Java project by listing the common schema files as imports in the avro-maven-plugin configuration. Example:
search_result.avro:
{
  "namespace": "com.myorg.other",
  "type": "record",
  "name": "SearchResult",
  "fields": [
    {"name": "type", "type": "SearchResultType"},
    {"name": "keyWord", "type": "string"},
    {"name": "searchEngine", "type": "string"},
    {"name": "position", "type": "int"},
    {"name": "userAction", "type": "UserAction"}
  ]
}
search_suggest.avro:
{
  "namespace": "com.myorg.other",
  "type": "record",
  "name": "SearchSuggest",
  "fields": [
    {"name": "suggest", "type": "string"},
    {"name": "request", "type": "string"},
    {"name": "searchEngine", "type": "string"},
    {"name": "position", "type": "int"},
    {"name": "userAction", "type": "UserAction"},
    {"name": "timestamp", "type": "long"}
  ]
}
user_action.avro:
{
  "namespace": "com.myorg.other",
  "type": "enum",
  "name": "UserAction",
  "symbols": ["S", "V", "C"]
}
search_result_type.avro:
{
  "namespace": "com.myorg.other",
  "type": "enum",
  "name": "SearchResultType",
  "symbols": ["O", "S", "A"]
}
avro-maven-plugin configuration:
<plugin>
  <groupId>org.apache.avro</groupId>
  <artifactId>avro-maven-plugin</artifactId>
  <version>1.7.4</version>
  <executions>
    <execution>
      <phase>generate-sources</phase>
      <goals>
        <goal>schema</goal>
      </goals>
      <configuration>
        <sourceDirectory>${project.basedir}/src/main/resources/avro</sourceDirectory>
        <outputDirectory>${project.basedir}/src/main/java/</outputDirectory>
        <includes>
          <include>**/*.avro</include>
        </includes>
        <imports>
          <import>${project.basedir}/src/main/resources/avro/user_action.avro</import>
          <import>${project.basedir}/src/main/resources/avro/search_result_type.avro</import>
        </imports>
      </configuration>
    </execution>
  </executions>
</plugin>
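If you parse schemas at runtime instead of generating classes with Maven, a similar effect can be sketched with Avro's Schema.Parser: a single parser instance remembers the named types it has already parsed, so a later schema can refer to them by name. The sketch below assumes the Avro Java library is on the classpath and inlines the Foo and Bar schemas from the question as strings to stay self-contained. The class and method names (SplitSchemaDemo, parseFoo) are made up for illustration.

```java
import org.apache.avro.Schema;

public class SplitSchemaDemo {

    // Stand-ins for the contents of two separate schema files.
    static final String BAR_JSON =
        "{\"type\": \"record\", \"name\": \"Bar\", \"fields\": []}";
    static final String FOO_JSON =
        "{\"type\": \"record\", \"name\": \"Foo\", \"fields\": ["
        + "{\"name\": \"bar\", \"type\": \"Bar\"}]}";

    static Schema parseFoo() {
        // Reuse ONE parser: it keeps the named types from earlier parse calls.
        Schema.Parser parser = new Schema.Parser();
        parser.parse(BAR_JSON);          // dependency first, registers "Bar"
        return parser.parse(FOO_JSON);   // the reference to "Bar" now resolves
    }

    public static void main(String[] args) {
        Schema foo = parseFoo();
        // The field's type is the full Bar record, not just a name.
        System.out.println(foo.getField("bar").schema().getName());   // prints "Bar"
    }
}
```

As with the plugin's imports, order matters: the schema that defines a type has to be parsed before any schema that refers to it.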
You can also define multiple schemas inside of one file:
schemas.avsc:
[
  {
    "type": "record",
    "name": "Bar",
    "fields": []
  },
  {
    "type": "record",
    "name": "Foo",
    "fields": [
      {"name": "bar", "type": "Bar"}
    ]
  }
]
This is not ideal if you want to reuse the schemas in multiple places, but in my opinion it improves readability and maintainability a lot.