
Julia cluster using docker

Tags: docker, julia

I am trying to connect to Docker containers using the default SSHManager. These containers only have a running sshd, with public key authentication, and Julia installed.

Here is my Dockerfile:

FROM rastasheep/ubuntu-sshd
RUN apt-get update && apt-get install -y julia
RUN mkdir -p /root/.ssh
ADD id_rsa.pub /root/.ssh/authorized_keys

I am running the container using:

sudo docker run -d -p 3333:22 -it --name julia-sshd julia-sshd

Then, on the host machine, I get the following error in the Julia REPL:

julia> import Base:SSHManager
julia> addprocs(["root@localhost:3333"])
stdin: is not a tty
Worker 2 terminated.
ERROR (unhandled task failure): EOFError: read end of file
Master process (id 1) could not connect within 60.0 seconds.
exiting.

I have tested that I can connect to the container via ssh without a password.

I have also tested that, from the Julia REPL, I can add a regular machine with Julia installed to the cluster, and that works fine.

But I cannot get these two things working together. Any help or suggestions will be appreciated.

torce asked Nov 29 '16 18:11



1 Answer

I recommend also deploying the master in a Docker container. That makes your environment easily and fully reproducible.

I'm working on a way to deploy workers in Docker containers on demand, i.e., a master deployed in a container can launch further DockerizedJuliaWorkers. It is similar to https://github.com/gsd-ufal/Infra.jl, but it assumes that the master and workers run on the same host, which makes things simpler.

This is ongoing work, and I plan to finish it in the coming weeks. In a nutshell:

1) You'll need a simple DockerBackend and a wrapper to transparently run containers, set up SSH, and call addprocs with all the low-level parameters (i.e., the DockerizedJuliaWorker.jl file):

https://github.com/NaelsonDouglas/DistributedMachineLearningThesis/tree/master/src/docker

2) Read here how to build the Docker image (Dockerfile is included):

https://github.com/NaelsonDouglas/DistributedMachineLearningThesis
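
As a rough illustration of what step 1 automates (this is not the code from DockerizedJuliaWorker.jl), the sketch below starts one sshd container per worker on a single host and prints the matching addprocs call. It assumes the julia-sshd image and port 3333 from the question; the julia-worker-N container names, the IMAGE, BASE_PORT, DRY_RUN and N variables, and the three helper functions are hypothetical. It also runs from the host rather than from a master container, and by default it only prints the docker commands instead of executing them.

```shell
#!/bin/sh
# Sketch only: all names and defaults here are illustrative assumptions.

IMAGE="${IMAGE:-julia-sshd}"     # image built from the question's Dockerfile
BASE_PORT="${BASE_PORT:-3333}"   # worker i is published on BASE_PORT + i - 1
DRY_RUN="${DRY_RUN:-1}"          # 1 = print docker commands, anything else = run them

# Print the SSHManager machine spec "root@localhost:PORT" for worker number $1.
machine_spec() {
    printf 'root@localhost:%s' "$((BASE_PORT + $1 - 1))"
}

# Start (or, in dry-run mode, print) the container for worker number $1.
# The -it flags from the question are left out: sshd does not need a terminal.
start_worker() {
    port=$((BASE_PORT + $1 - 1))
    cmd="docker run -d -p ${port}:22 --name julia-worker-$1 $IMAGE"
    if [ "$DRY_RUN" = "1" ]; then
        echo "$cmd"
    else
        $cmd   # word splitting is intended; the values contain no spaces
    fi
}

# Print the Julia call that adds workers 1..$1 to the cluster.
addprocs_call() {
    specs=""
    i=1
    while [ "$i" -le "$1" ]; do
        specs="${specs}${specs:+, }\"$(machine_spec "$i")\""
        i=$((i + 1))
    done
    printf 'addprocs([%s])\n' "$specs"
}

# Main: handle N workers (default 2), then show the call to paste into the REPL.
N="${N:-2}"
i=1
while [ "$i" -le "$N" ]; do
    start_worker "$i"
    i=$((i + 1))
done
addprocs_call "$N"
```

Running it with DRY_RUN=0 actually starts the containers. The printed addprocs call can then be pasted into the master's Julia REPL, once you have checked that a non-interactive ssh login to each published port can start julia.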

Please tell me if you have any suggestions on how to improve it.

Best,

André Lage.

André Lage answered Sep 29 '22 08:09