I have a large number of bytes per second coming from a sensor device (e.g., video) that are being read and processed by a process in a Docker container.
I have a second Docker container that would like to read the processed byte stream (still a large number of bytes per second).
What is an efficient way to read this stream? Ideally I'd like to have the first container write to some sort of shared memory buffer that the second container can read from, but I don't think separate Docker containers can share memory. Perhaps there is some solution with a shared file pointer, with the file saved to an in-memory file system?
My goal is to maximize performance and minimize useless copies of data from one buffer to another as much as possible.
Edit: I would love to have solutions for both Linux and Windows. Similarly, I'm interested in solutions for doing this in C++ as well as Python.
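For reference, something along these lines is roughly what I have in mind: a Python sketch that maps a file on an in-memory filesystem (e.g. /dev/shm) that is bind-mounted into both containers. The path and buffer size below are placeholders.

import mmap
import os

SHM_PATH = "/dev/shm/sensor_buffer"  # placeholder; shared into both containers as a volume
BUF_SIZE = 64 * 1024 * 1024          # placeholder size, 64 MiB

# Writer side: create and size the backing file, then map it.
fd = os.open(SHM_PATH, os.O_CREAT | os.O_RDWR)
os.ftruncate(fd, BUF_SIZE)
buf = mmap.mmap(fd, BUF_SIZE)

# Both processes now see the same pages; synchronization (e.g. a write
# offset plus a semaphore) would still be needed and is omitted here.
buf[0:5] = b"hello"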
Multiple containers can mount the same volume when they need access to shared data. Docker creates a local volume by default, but a volume driver can be used to share data across multiple machines.
Create a fifo with mkfifo /tmp/myfifo.
Share it with both containers: --volume /tmp/myfifo:/tmp/myfifo:rw
You can use it directly:
From container 1: echo foo >>/tmp/myfifo
In container 2: read var </tmp/myfifo
Drawback: Container 1 is blocked until Container 2 reads the data and empties the buffer.
Avoid the blocking: in both containers, run exec 3<>/tmp/myfifo in bash.
From container 1: echo foo >&3
In container 2: read var <&3 (or e.g. cat <&3)
This solution uses exec file descriptor handling from bash. I don't know the equivalents off-hand, but it is certainly possible in other languages, too.
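For example, a minimal Python sketch of the same idea, assuming the FIFO is mounted at /tmp/myfifo in both containers. Opening it read/write mirrors the exec 3<> trick, so os.open() does not block waiting for the other end; the chunk size is a placeholder.

import os

FIFO = "/tmp/myfifo"  # shared into both containers via --volume

# Container 1 (writer): O_RDWR keeps both ends of the FIFO open, like exec 3<>
fd = os.open(FIFO, os.O_RDWR)
os.write(fd, b"foo\n")

# Container 2 (reader):
fd = os.open(FIFO, os.O_RDWR)
chunk = os.read(fd, 65536)  # returns up to 64 KiB per call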
Using a simple TCP socket would be my first choice. Only if measurements showed that we absolutely need to squeeze the last bit of performance out of the system would I fall back to pipes or shared memory.
Going by the problem statement, the process appears to be bound by local CPU/memory resources rather than by external services. In that case, having both producer and consumer on the same machine (as Docker containers) might exhaust the CPU before anything else becomes the bottleneck, but I would measure first before acting.
Most of the effort in developing code is spent maintaining it, so I favor mainstream practices. The TCP stack has rock-solid foundations and is as optimized for performance as humanly possible. It is also far more (completely?) portable across platforms and frameworks. Docker containers on the same host do not hit the wire when communicating over TCP. If the processes ever do hit a resource limit, you can scale horizontally by splitting the producer and consumer across physical hosts, manually or with something like Kubernetes, and TCP will keep working seamlessly. If you are never going to need that level of throughput, you also won't need system-level sophistication in inter-process communication.
Go by TCP.
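As a rough illustration only, here is a minimal Python sketch of the producer/consumer over TCP. The port and the "producer" host name are placeholders; on a user-defined Docker network you would typically connect by service or container name.

import socket

PORT = 5000  # placeholder port

# Producer (container 1): accept one consumer and stream processed bytes to it.
srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
srv.bind(("0.0.0.0", PORT))
srv.listen(1)
conn, _ = srv.accept()
conn.sendall(b"...processed bytes...")

# Consumer (container 2): connect by the producer's name and read in large chunks.
cli = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
cli.connect(("producer", PORT))  # "producer" is an assumed container/service name
data = cli.recv(1 << 20)  # up to 1 MiB per call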