
Kubernetes and MPI

I want to run an MPI job on my Kubernetes cluster. The context is that I'm actually running a modern, nicely containerised app but part of the workload is a legacy MPI job which isn't going to be re-written anytime soon, and I'd like to fit it into a kubernetes "worldview" as much as possible.

One initial question: has anyone had any success running MPI jobs on a Kubernetes cluster? I've seen Christian Kniep's work on getting MPI jobs to run in Docker containers, but he's going down the Docker Swarm path (with peer discovery using Consul running in each container). I want to stick to Kubernetes, which already knows about all the peers, and inject that information into the container from the outside. I do have full control over all the parts of the application, e.g. I can choose which MPI implementation to use.

I have a couple of ideas about how to proceed:

  1. fat containers containing slurm and the application code -> populate the slurm.conf with appropriate info about the peers at container startup -> use srun as the container entrypoint to start the jobs

  2. slimmer containers with only OpenMPI (no slurm) -> populate a rankfile in the container with info from outside (provided by kubernetes) -> use mpirun as the container entrypoint

  3. an even slimmer approach, where I basically "fake" the MPI runtime by setting a few environment variables (e.g. the OpenMPI ORTE ones) -> run the mpicc'd binary directly (where it'll find out about its peers through the env vars)

  4. some other option

  5. give up in despair
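To make option 2 a bit more concrete, here is a minimal sketch of the "populate a file with peer info from outside" step. It writes a plain Open MPI hostfile (one `host slots=N` line per peer) rather than a full rankfile, since that is the simpler of the two. The `MPI_PEERS` environment variable and the `mpi-worker-N` hostnames are hypothetical; they stand in for whatever mechanism Kubernetes would use to hand the peer list to the container.

```python
import os


def peers_from_env(value):
    """Split a comma-separated peer list, ignoring blanks and stray spaces.

    The MPI_PEERS variable that feeds this is an assumption, not a
    Kubernetes or Open MPI convention.
    """
    return [host.strip() for host in value.split(",") if host.strip()]


def build_hostfile(hosts, slots_per_host=1):
    """Render Open MPI hostfile text: one 'host slots=N' line per peer."""
    if slots_per_host < 1:
        raise ValueError("slots_per_host must be at least 1")
    lines = [f"{host} slots={slots_per_host}" for host in hosts]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical default peers, used only when MPI_PEERS is not set.
    raw = os.environ.get("MPI_PEERS", "mpi-worker-0,mpi-worker-1")
    print(build_hostfile(peers_from_env(raw), slots_per_host=2), end="")
```

The entrypoint could write this output to a file and pass it to mpirun with `--hostfile`; whether the hostnames actually resolve between pods is a separate problem.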

I know trying to mix "established" workflows like MPI with the "new hotness" of kubernetes and containers is a bit of an impedance mismatch, but I'm just looking for pointers/gotchas before I go too far down the wrong path. If nothing exists I'm happy to hack on some stuff and push it back upstream.

Asked by Ben, Jun 29 '16 07:06



1 Answer

I experimented with MPI jobs on Kubernetes for a few days and got them working by setting dnsPolicy: None together with dnsConfig on the pods (this requires the CustomDNS=true feature gate).
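As a rough illustration of what those two settings look like on a pod, the sketch below builds a manifest as a Python dict and prints it as JSON, which kubectl accepts as well as YAML. `dnsPolicy` and `dnsConfig` (with its `nameservers` and `searches` lists) are real pod spec fields, and a pod with `dnsPolicy: None` needs at least one nameserver. Everything else here is a placeholder and is not taken from the linked chart: the pod name, image, nameserver IP, and search domain are assumptions.

```python
import json


def mpi_pod_manifest(name, image, nameservers, searches):
    """Build a pod manifest that bypasses the cluster's default DNS settings."""
    if not nameservers:
        raise ValueError("dnsPolicy None needs at least one nameserver")
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            # "None" tells Kubernetes to use only what dnsConfig provides.
            "dnsPolicy": "None",
            "dnsConfig": {
                "nameservers": list(nameservers),
                "searches": list(searches),
            },
            "containers": [{"name": "mpi", "image": image}],
        },
    }


if __name__ == "__main__":
    manifest = mpi_pod_manifest(
        name="mpi-worker-0",                      # hypothetical pod name
        image="example.com/my-openmpi:latest",    # hypothetical image
        nameservers=["10.96.0.10"],               # placeholder cluster DNS IP
        searches=["mpi.default.svc.cluster.local"],  # placeholder search domain
    )
    print(json.dumps(manifest, indent=2))
```

A custom search domain like this lets short pod hostnames resolve between peers, which is the sort of thing mpirun depends on when it contacts the hosts it was given.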

I pushed my manifests (as a Helm chart) here:

https://github.com/everpeace/kube-openmpi

I hope it helps.

Answered by everpeace, Sep 28 '22 10:09