I want to use Apache Arrow to send data from a Django backend to an Angular frontend. I want to use a dictionary of dataframes/tables as the payload in messages. It's possible with pyarrow to share data this way between Python microservices, but I can't find a way to do it with the JavaScript implementation of Arrow.
Is there a way to serialize/deserialize a dictionary with strings as keys and dataframes/tables as values on the JavaScript side with Arrow?
Yes, a variant of this is possible using the RecordBatchReader and RecordBatchWriter IPC primitives in both pyarrow and ArrowJS.
On the Python side, you can serialize a Table to a buffer like this:
import pyarrow as pa

def serialize_table(table):
    sink = pa.BufferOutputStream()
    writer = pa.RecordBatchStreamWriter(sink, table.schema)
    writer.write_table(table)
    writer.close()
    return sink.getvalue().to_pybytes()

# ...later, in your route handler:
buf = serialize_table(create_your_arrow_table())
Then you can send the bytes in the response body. If you have multiple tables, you can concatenate the buffers from each as one large payload.
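For instance, here is a minimal sketch of a Django view that does this (the view function, the serialize_tables() helper, and create_your_arrow_table() are assumptions for illustration, not part of any Arrow API):

import pyarrow as pa
from django.http import HttpResponse

def serialize_tables(tables):
    # Each table is written as its own complete IPC stream into the same buffer,
    # so the frontend can split them back apart with RecordBatchReader.readAll().
    sink = pa.BufferOutputStream()
    for table in tables:
        with pa.RecordBatchStreamWriter(sink, table.schema) as writer:
            writer.write_table(table)
    return sink.getvalue().to_pybytes()

def arrow_tables_view(request):
    # create_your_arrow_table() is a hypothetical placeholder for however you build your tables
    payload = serialize_tables([create_your_arrow_table()])
    return HttpResponse(payload, content_type='application/octet-stream')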
I'm not sure what functionality exists for writing multipart/form-data responses in Python, but that's probably the best way to craft the response if you want the tables to be sent with their names (or any other metadata you wish to include).
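If you'd rather skip multipart encoding, one alternative (a sketch of an assumption, not an established convention) is to tag each table with its name in its schema metadata before writing; ArrowJS exposes schema-level metadata on each reader's schema, so the frontend can rebuild the dictionary from those keys:

import pyarrow as pa

def serialize_named_tables(tables_by_name):
    # tables_by_name is a dict of {str: pa.Table}; 'table_name' is an arbitrary
    # metadata key chosen here, and replace_schema_metadata() overwrites any
    # existing schema-level metadata.
    sink = pa.BufferOutputStream()
    for name, table in tables_by_name.items():
        tagged = table.replace_schema_metadata({'table_name': name})
        with pa.RecordBatchStreamWriter(sink, tagged.schema) as writer:
            writer.write_table(tagged)
    return sink.getvalue().to_pybytes()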
On the JavaScript side, you can read the response either with Table.from() (if you have just one table), or with the RecordBatchReader if you have more than one, or if you want to read each RecordBatch in a streaming fashion:
import { Table, RecordBatchReader } from 'apache-arrow'

// easy if you want to read the first (or only) table in the response
const table = await Table.from(fetch('/table'))

// or for multiple tables on the same stream, or to read in a streaming fashion:
for await (const reader of RecordBatchReader.readAll(fetch('/table'))) {
  // Buffer all batches into a table
  const table = await Table.from(reader)
  // Or process each batch as it's downloaded
  for await (const batch of reader) {
    // ...do something with each RecordBatch here
  }
}
You can see more examples of this in our tests for ArrowJS here: https://github.com/apache/arrow/blob/3eb07b7ed173e2ecf41d689b0780dd103df63a00/js/test/unit/ipc/writer/stream-writer-tests.ts#L40
You can also see some examples in a little fastify plugin I wrote for consuming and producing Arrow payloads in node: https://github.com/trxcllnt/fastify-arrow