 

How to send file in buffer from Python to Julia

I have a large pandas DataFrame in Python that I would like to access in a Julia program (as a Julia DataFrames.DataFrame object). Since I would like to avoid writing to disk every time I send a file from Python to Julia, the ideal approach seems to be storing the DataFrame as an Apache Arrow/Feather file in an in-memory buffer and sending that buffer over TCP from Python to Julia.

I have tried extensively but cannot figure out how to:

  1. Write Apache Arrow/Feather files to memory (not storage)
  2. Send them over TCP from Python
  3. Read them from the TCP port in Julia

Thanks for your help.

Jack N asked Sep 10 '25 22:09



1 Answer

Hmmm, good question. I'm not sure a TCP socket is necessarily the easiest route, since one end has to act as the "server" socket and the other as the client. The typical TCP flow is: 1) the server binds to a port and listens on it, 2) the server calls "accept" to wait for a new connection, 3) the client calls "connect" on that port to initiate the connection, 4) once the server accepts, the connection is established and the server and client can write data to each other over the connected socket.
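To make those four steps concrete, here is a minimal sketch of the flow using only Python's standard library. Both ends run in one process purely for illustration, and the payload is a stand-in for the Arrow bytes (for instance, bytes produced by pyarrow's IPC stream writer into an in-memory buffer). The helper names and the 8-byte length prefix are my own assumptions, not part of Arrow or of any library; the prefix is just one simple way to tell the receiver how many bytes to expect.

```python
import socket
import struct
import threading


def recv_exact(conn: socket.socket, n: int) -> bytes:
    """Read exactly n bytes; a single recv() may return fewer than requested."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = conn.recv(min(remaining, 65536))
        if not chunk:
            raise ConnectionError("socket closed before the full message arrived")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def send_framed(conn: socket.socket, payload: bytes) -> None:
    """Send an 8-byte big-endian length prefix followed by the payload (hypothetical framing)."""
    conn.sendall(struct.pack("!Q", len(payload)) + payload)


def recv_framed(conn: socket.socket) -> bytes:
    """Read the length prefix, then exactly that many payload bytes."""
    (length,) = struct.unpack("!Q", recv_exact(conn, 8))
    return recv_exact(conn, length)


def demo(payload: bytes) -> bytes:
    # 1) The server binds and listens; port 0 asks the OS for any free port.
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))
    server.listen(1)
    port = server.getsockname()[1]
    received = []

    def serve() -> None:
        # 2) The server blocks in accept() until a client connects.
        conn, _addr = server.accept()
        with conn:
            received.append(recv_framed(conn))

    worker = threading.Thread(target=serve)
    worker.start()

    # 3) The client connects to the listening port.
    with socket.create_connection(("127.0.0.1", port)) as client:
        # 4) Once connected, either side can write; here the client sends.
        send_framed(client, payload)

    worker.join()
    server.close()
    return received[0]


payload = b"stand-in for Arrow bytes" * 1000
echoed = demo(payload)
```

In a real setup the two halves would live in separate processes, with the Julia side playing either role and applying the same framing before handing the bytes to Arrow.jl.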

I've had success doing something similar to what you describe by using mmapped (memory-mapped) files, though maybe you have a hard requirement not to touch disk at all. It works nicely because the Python and Julia processes simply "share" the mmapped file.
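A rough sketch of the Python half of that idea, again with only the standard library: one function writes the bytes to a file, and another maps the file into memory and reads it back. The file name `table.arrow`, the helper names, and the placeholder payload are assumptions for illustration; in practice the bytes would be an Arrow file written by the producer, and the reader would be the Julia process opening the same path.

```python
import mmap
import os
import tempfile


def write_shared(path: str, payload: bytes) -> None:
    """Producer side: write the serialized bytes to the shared file."""
    with open(path, "wb") as f:
        f.write(payload)


def read_shared(path: str) -> bytes:
    """Consumer side: map the file read-only and copy its contents out."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file; the file must not be empty.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as view:
            return view[:]


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "table.arrow")  # hypothetical file name
    payload = b"stand-in for Arrow file bytes"
    write_shared(path, payload)
    round_tripped = read_shared(path)
```

If avoiding physical disk matters, on Linux the file can be placed on a tmpfs mount such as /dev/shm, which keeps its contents in RAM.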

Another approach you could check out is what I set up for "round trip" testing in the Arrow.jl Julia package: https://github.com/apache/arrow-julia/blob/main/test/pyarrow_roundtrip.jl. It's set up to use PyCall.jl from Julia to share the bytes between Python and Julia.

Hope that helps!

quinnj answered Sep 13 '25 13:09
