Is it possible to perform distributed concurrent writes to the Parquet format?
And is it possible to read parquet files while they are being written?
If there are methods for concurrent reads/writes, I'd be interested to learn about them.
Thanks in advance for your help.
I eventually got an answer from the Parquet developers. The answer is no to both questions:
Parquet writers are not thread-safe and files cannot be read or written by different readers or writers concurrently. Parquet doesn't expose flush/sync operations to the user (for good reason) so there isn't a way to reliably do this anyway.
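One way to work within these limits, offered here only as a sketch and not as something the developers' reply recommends, is to never share a file: each writer produces its own complete file under a temporary name and renames it into a shared directory once it is closed, and readers only look at finished files. The names `publish_part`, `visible_parts`, `demo_write`, and the `part-*.parquet` / `.tmp` naming are hypothetical, and `demo_write` just writes raw bytes as a stand-in for a real Parquet writer.

```python
import os
import tempfile
import uuid
from concurrent.futures import ThreadPoolExecutor


def publish_part(dataset_dir, payload, write_fn):
    """Write one complete file under a temporary name, then rename it into place."""
    os.makedirs(dataset_dir, exist_ok=True)
    final_path = os.path.join(dataset_dir, f"part-{uuid.uuid4().hex}.parquet")
    tmp_path = final_path + ".tmp"
    write_fn(tmp_path, payload)       # the file is fully written and closed here
    os.replace(tmp_path, final_path)  # atomic when both paths are on one filesystem
    return final_path


def visible_parts(dataset_dir):
    """Readers list only finished files and skip in-progress '.tmp' ones."""
    return sorted(n for n in os.listdir(dataset_dir) if n.endswith(".parquet"))


def demo_write(path, payload):
    # Stand-in writer (assumption): writes raw bytes, not real Parquet.
    # With pyarrow installed, this is where a call such as
    # pyarrow.parquet.write_table(table, path) would go.
    with open(path, "wb") as f:
        f.write(payload)


def run_demo(n_writers=4):
    """Run several writers in parallel, each with its own writer and its own file."""
    dataset_dir = tempfile.mkdtemp()
    with ThreadPoolExecutor(max_workers=n_writers) as pool:
        futures = [
            pool.submit(publish_part, dataset_dir, b"rows-%d" % i, demo_write)
            for i in range(n_writers)
        ]
        paths = [f.result() for f in futures]
    return dataset_dir, paths
```

No writer object is shared between threads, and no reader ever opens a file that is still being written, so the sketch stays within the constraints quoted above. The rename is only atomic on a local filesystem; object stores and distributed filesystems may behave differently.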