Can I split an Apache Avro schema across multiple files?

Tags:

avro

I can do,

{
    "type": "record",
    "name": "Foo",
    "fields": [
        {"name": "bar", "type": {
            "type": "record",
            "name": "Bar",
            "fields": [ ]
        }}
    ]
}

and that works fine, but suppose I want to split the schema into two files, such as:

{
    "type": "record",
    "name": "Foo",
    "fields": [
        {"name": "bar", "type": "Bar"}
    ]
}

{
    "type": "record",
    "name": "Bar",
    "fields": [ ]
}

Does Avro have the capability to do this?

480

asked Feb 03 '14 22:02

Owen

2 Answers

Yes, it's possible.

I've done that in my Java project by listing the shared schema files as imports in the avro-maven-plugin configuration, so that the other schemas can refer to those types by name. Example:

search_result.avro:

{
    "namespace": "com.myorg.other",
    "type": "record",
    "name": "SearchResult",
    "fields": [
        {"name": "type", "type": "SearchResultType"},
        {"name": "keyWord", "type": "string"},
        {"name": "searchEngine", "type": "string"},
        {"name": "position", "type": "int"},
        {"name": "userAction", "type": "UserAction"}
    ]
}

search_suggest.avro:

{
    "namespace": "com.myorg.other",
    "type": "record",
    "name": "SearchSuggest",
    "fields": [
        {"name": "suggest", "type": "string"},
        {"name": "request", "type": "string"},
        {"name": "searchEngine", "type": "string"},
        {"name": "position", "type": "int"},
        {"name": "userAction", "type": "UserAction"},
        {"name": "timestamp", "type": "long"}
    ]
}

user_action.avro:

{
    "namespace": "com.myorg.other",
    "type": "enum",
    "name": "UserAction",
    "symbols": ["S", "V", "C"]
}

search_result_type.avro:

{
    "namespace": "com.myorg.other",
    "type": "enum",
    "name": "SearchResultType",
    "symbols": ["O", "S", "A"]
}

avro-maven-plugin configuration:

<plugin>
    <groupId>org.apache.avro</groupId>
    <artifactId>avro-maven-plugin</artifactId>
    <version>1.7.4</version>
    <executions>
        <execution>
            <phase>generate-sources</phase>
            <goals>
                <goal>schema</goal>
            </goals>
            <configuration>
                <sourceDirectory>${project.basedir}/src/main/resources/avro</sourceDirectory>
                <outputDirectory>${project.basedir}/src/main/java/</outputDirectory>
                <includes>
                    <include>**/*.avro</include>
                </includes>
                <imports>
                    <import>${project.basedir}/src/main/resources/avro/user_action.avro</import>
                    <import>${project.basedir}/src/main/resources/avro/search_result_type.avro</import>
                </imports>
            </configuration>
        </execution>
    </executions>
</plugin>

127

answered Sep 17 '22 21:09

AlexTiunov

You can also define multiple schemas inside of one file:

schemas.avsc:

[
    {
        "type": "record",
        "name": "Bar",
        "fields": [ ]
    },
    {
        "type": "record",
        "name": "Foo",
        "fields": [
            {"name": "bar", "type": "Bar"}
        ]
    }
]

This is not ideal if you want to reuse the schemas in several places, but in my opinion it improves readability and maintainability a lot.
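If the types should still live in separate files for editing, one option is to join them into an array-style file like schemas.avsc as a build step. The following is only a rough sketch of that idea: the class and method names (SchemaBundle, bundle) are hypothetical, it uses nothing but the JDK and does not call Avro itself, and it assumes the caller passes the schemas with dependencies first, the way Bar precedes Foo above.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Avro: it only concatenates JSON text.
class SchemaBundle {

    /**
     * Joins individual schema definitions into a single JSON array.
     * Assumption: the list is ordered so that a named type (e.g. Bar)
     * appears before any schema that refers to it by name (e.g. Foo).
     */
    public static String bundle(List<String> schemaJsonInOrder) {
        StringBuilder out = new StringBuilder("[\n");
        for (int i = 0; i < schemaJsonInOrder.size(); i++) {
            if (i > 0) {
                out.append(",\n"); // separator between array elements
            }
            out.append(schemaJsonInOrder.get(i).trim());
        }
        return out.append("\n]\n").toString();
    }

    public static void main(String[] args) {
        // In a real build these strings would be read from the split files.
        String bar = "{\"type\": \"record\", \"name\": \"Bar\", \"fields\": []}";
        String foo = "{\"type\": \"record\", \"name\": \"Foo\", \"fields\": ["
                + "{\"name\": \"bar\", \"type\": \"Bar\"}]}";
        System.out.print(bundle(Arrays.asList(bar, foo)));
    }
}
```

Running main prints a JSON array with the Bar definition followed by the Foo definition, which could be written out as schemas.avsc. The helper does not validate the JSON or work out the dependency order; both are left to the caller.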

answered Sep 20 '22 21:09

Michael
