We have a requirement to append to ORC files. I searched online but found nothing. Also, org.apache.hadoop.hive.ql.io.orc.WriterImpl does not have an append API. Is there any way to append to ORC files? (More specifically, using Java.)
ORC data files are subdivided into independent stripes; each stripe is created in a single atomic step. See the official documentation for details.
I don't believe you can directly append to an existing file on the fly. That would leave a corrupt stripe (and hence a corrupt file) if the job crashed while writing.
But you can
orc.stripe.size
property) per reducer
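
To illustrate the crash-safety concern, here is a minimal sketch of one common alternative to appending in place: write each new batch as a separate file next to the existing ones, and only make it visible through a rename once it is complete. This is my assumption about a reasonable workaround, not something the ORC API prescribes. The class name, the `addBatch` helper, and the file names are hypothetical, and plain bytes stand in for whatever a real ORC writer would produce.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.stream.Stream;

public class NewFilePerBatch {

    /**
     * Adds one batch to a directory as a brand-new file instead of
     * appending to an existing one. The bytes are a placeholder for the
     * output of a real ORC writer (assumption for illustration only).
     */
    public static Path addBatch(Path tableDir, String fileName, byte[] batch)
            throws IOException {
        Files.createDirectories(tableDir);

        // Write under a temporary name in the same directory, so the final
        // rename stays on one filesystem. If the job crashes here, only the
        // temporary file is incomplete; existing files are untouched.
        Path tmp = tableDir.resolve("." + fileName + ".tmp");
        Files.write(tmp, batch);

        // Publish the finished file in one step. ATOMIC_MOVE throws
        // AtomicMoveNotSupportedException if the filesystem cannot do it.
        Path target = tableDir.resolve(fileName);
        return Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("orc_table");
        addBatch(dir, "part-00000.orc", new byte[] {1, 2, 3});
        addBatch(dir, "part-00001.orc", new byte[] {4, 5});

        // Count the files now visible in the directory.
        try (Stream<Path> files = Files.list(dir)) {
            System.out.println(files.count());
        }
    }
}
```

The point of the design is that a reader of the directory never sees a half-written file under its final name, which is exactly the failure mode an in-place append would risk.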