I have two Docker images, one containing pandoc (a utility to convert documents between many formats), and another containing pdflatex (from texlive, to convert tex files into pdf). My goal here is to convert documents from md to pdf.
I can run each image separately:
# call pandoc inside my-pandoc-image (md -> tex)
docker run --rm \
-v $(pwd):/pandoc \
my-pandoc-image \
pandoc -s test.md -o test.tex
# call pdflatex inside my-texlive-image (tex -> pdf)
docker run --rm \
-v $(pwd):/texlive \
my-texlive-image \
pdflatex test.tex # generates test.pdf
But what I actually want is to call pandoc (from its container) directly to convert md into pdf, like this:
docker run --rm \
-v $(pwd):/pandoc \
my-pandoc-image \
pandoc -s test.md --latex-engine pdflatex -o test.pdf
This command does not work here, because pandoc inside the container tries to call pdflatex (which must be in $PATH) to generate the pdf, but pdflatex does not exist there since it is not installed in my-pandoc-image.
In my case, pdflatex is installed in the image my-texlive-image.
So, from this example, my question is: can a container A call an executable located in another container B?
I am pretty sure this is possible, because if I install pandoc on my host (without pdflatex), I can run pandoc -s test.md --latex-engine=pdflatex -o test.pdf by simply aliasing the pdflatex command with:
pdflatex() {
docker run --rm \
-v $(pwd):/texlive \
my-texlive-image \
pdflatex "$@"
}
Thus, when pdflatex is called by pandoc, a container starts and does the conversion.
But when using the 2 containers, how could I alias the pdflatex command to simulate its existence in the container that has only pandoc?
I took a look at docker-compose, since I have already used it to make 2 containers communicate (an app communicating with a database). I even thought about ssh-ing from container A to container B to call the pdflatex command, but this is definitely not the right solution.
Finally, I have also built an image containing pandoc + pdflatex (it worked because the two executables were in the same image), but I really want to keep the 2 images separate, since they could be used independently by other images.
A similar question is asked here. As I understand it, the provided answer requires Docker to be installed in container A, and a Docker socket binding (/var/run/docker.sock) between the host and container A. I don't think this is best practice; it seems like a hack that can create security issues.
There are multiple solutions to your problem; I'll let you choose the one that suits you best. They are presented below from the cleanest to the ugliest (in my opinion, and with regard to generally followed best practices).
If you end up calling it often, it may be worth exposing pandoc as an (HTTP) API. Some images already do that, for example metal3d/pandoc-server (which I have already used successfully, but I'm sure you can find others).
In this case, you just run a container with pandoc + pdflatex once and you're set!
Make 2 images: one with pandoc only, and the other with pandoc + pdflatex, inheriting from the first one with the FROM directive in the Dockerfile.
This addresses your concerns about size while still letting you run pandoc without having to fetch pdflatex too. Then, if you need to pull the image with pdflatex, it will just be an extra layer, not the entire image.
You can also do it the other way around, with a base image containing pdflatex and another adding pandoc to it, if you find yourself often using the pdflatex image alone and rarely using the pandoc image without pdflatex. You could also make 3 images (pandoc, pdflatex, and pdflatex + pandoc) to cover every need you might have, but then at least one image won't be linked in any way to the other 2 (an image can only inherit from one parent), making it a bit harder to maintain.
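As a rough illustration of the layered approach, the sketch below writes a child Dockerfile that starts FROM my-pandoc-image and adds pdflatex on top. It rests on assumptions that are not in the original post: that my-pandoc-image is Debian/Ubuntu-based (so apt-get exists), that the texlive-latex-base package is enough for your documents, and the file and tag names (Dockerfile.pandoc-texlive, my-pandoc-texlive-image) are made up for the example.

```shell
#!/bin/sh
# Sketch: build pandoc + pdflatex as one extra layer on top of my-pandoc-image.
# Assumptions (not from the original post): the base image is Debian/Ubuntu-based,
# and texlive-latex-base provides a usable pdflatex for your documents.
set -eu

# Write the child Dockerfile; the file name is hypothetical.
cat > Dockerfile.pandoc-texlive <<'EOF'
FROM my-pandoc-image
RUN apt-get update \
 && apt-get install -y --no-install-recommends texlive-latex-base \
 && rm -rf /var/lib/apt/lists/*
EOF

# Print the build command instead of running it, so the sketch has no side effects.
echo "docker build -f Dockerfile.pandoc-texlive -t my-pandoc-texlive-image ."
```

Anyone who only needs pandoc keeps pulling my-pandoc-image; pulling the combined image afterwards should only download the additional layer.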
my-pandoc-image + Docker socket mount
This is the solution that you mentioned at the end of your post, and it is probably the most generic and straightforward way to call other containerized commands, leaving aside your precise use case of pandoc + pdflatex.
Just add the docker client to your image my-pandoc-image and pass the Docker socket as a volume at runtime using docker run -v /var/run/docker.sock:/var/run/docker.sock. And if your concern is not being able to make pandoc call docker run ... instead of pdflatex directly, just add a small wrapper called pdflatex in /usr/local/bin/ which will be responsible for doing the docker run.
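Here is a minimal sketch of what such a wrapper could look like. It generates the file locally so that it can later be copied to /usr/local/bin/pdflatex in my-pandoc-image. HOST_PWD and WRAPPER_DIR are hypothetical names introduced for this example. HOST_PWD is needed because, with a mounted socket, bind-mount paths are resolved by the Docker daemon on the host, not inside the pandoc container. The sketch also assumes that every file pdflatex is asked to read or write lives under that mounted directory.

```shell
#!/bin/sh
# Sketch: generate a pdflatex wrapper for my-pandoc-image (Docker socket approach).
# Assumptions (not from the original post): HOST_PWD is a hypothetical variable
# holding the host directory that was mounted into the pandoc container, and
# WRAPPER_DIR is a hypothetical staging directory for the generated file.
set -eu

WRAPPER_DIR="${WRAPPER_DIR:-./bin}"
mkdir -p "$WRAPPER_DIR"

# The quoted 'EOF' keeps ${HOST_PWD} and "$@" literal, so they are expanded
# when the wrapper runs, not when it is generated.
cat > "$WRAPPER_DIR/pdflatex" <<'EOF'
#!/bin/sh
# Bind-mount paths are resolved by the Docker daemon on the host,
# so the host-side path has to be passed in explicitly.
: "${HOST_PWD:?set HOST_PWD to the host directory holding the documents}"
exec docker run --rm \
  -v "${HOST_PWD}:/texlive" \
  my-texlive-image \
  pdflatex "$@"
EOF
chmod +x "$WRAPPER_DIR/pdflatex"
```

The pandoc container would then be started with both the socket and the variable, for example by adding -v /var/run/docker.sock:/var/run/docker.sock -e HOST_PWD="$(pwd)" to the docker run command from the question.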
This one is probably the least clean I'll present here. You could try getting either the pandoc binary into a pdflatex container or the pdflatex binary into a pandoc container using --volumes-from, to keep everything packaged in its own Docker image. But honestly, it's more duct tape than a real solution.
You can choose the solution that best fits your needs, but I would recommend the first 2 and strongly discourage the last one.