Hi,
There is a topic about writing text data to multiple output directories in one Spark job using MultipleTextOutputFormat:
Write to multiple outputs by key Spark - one Spark job
Is there a similar way to write Avro data to multiple directories?
What I want is to write the records of an Avro file to different directories based on the timestamp field, so that records whose timestamps fall on the same day go to the same directory.
The AvroMultipleOutputs class simplifies writing Avro output data to multiple outputs.
Case one: writing to additional outputs other than the job default output. Each additional output, or named output, may be configured with its own Schema and OutputFormat.
Case two: writing data to different files whose paths are provided by the user.
AvroMultipleOutputs supports counters; by default they are disabled. The counters group is the AvroMultipleOutputs class name. The names of the counters are the same as the output names. These count the number of records written to each output name.
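For the "same day goes to the same directory" requirement, case two needs a per-record base output path. Here is a minimal sketch of how that path could be derived, assuming the timestamp field holds epoch milliseconds and that days are cut in UTC (neither is stated in the question). The class name `DayPartition`, the method `basePathFor` and the `part` file prefix are hypothetical names chosen for illustration. As far as I know, the resulting string is what you would pass as the `baseOutputPath` argument of `AvroMultipleOutputs.write(key, value, baseOutputPath)`, but check that against the Avro version you use.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;

// Hypothetical helper: maps a record timestamp to a per-day base output path.
final class DayPartition {

    private DayPartition() {
    }

    // Assumption: epochMillis is milliseconds since 1970-01-01T00:00:00Z.
    // Assumption: day boundaries are taken in UTC.
    static String basePathFor(long epochMillis) {
        LocalDate day = Instant.ofEpochMilli(epochMillis)
                .atZone(ZoneOffset.UTC)
                .toLocalDate();
        // LocalDate.toString() yields ISO-8601 yyyy-MM-dd, used here as the directory name.
        // "part" is an arbitrary file-name prefix inside that directory.
        return day + "/part";
    }

    public static void main(String[] args) {
        // Two timestamps on the same UTC day map to the same directory.
        System.out.println(basePathFor(0L));
        System.out.println(basePathFor(86_399_999L));
        // One millisecond later falls on the next day.
        System.out.println(basePathFor(86_400_000L));
    }
}
```

In a reducer or map-only job you would compute this path for each record and hand it to the multiple-outputs writer, so that every record of a given day lands under the same directory. If your timestamps are in seconds or in a local time zone, adjust the conversion accordingly.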
Also have a look at