
How do I use an InfiniBand network with Dask?

I have a cluster with a high-performance network (InfiniBand). However, when I set up my Dask scheduler and workers, performance is slower than I would expect. How can I tell Dask to use this network?

Disclaimer: I'm asking this question only so that I can answer it. It has become a frequently asked question.

MRocklin asked May 09 '17 23:05



1 Answer

As of dask.distributed version 1.16.3, you can pass a network interface to the dask-scheduler and dask-worker executables with the --interface option, as follows:

dask-scheduler --interface ib0 --scheduler-file ~/my.cluster.yaml
dask-worker --interface ib0 --scheduler-file ~/my.cluster.yaml
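
Once the scheduler and workers are running, a client can connect through the same scheduler file. The following is a minimal sketch, assuming dask.distributed is installed and that the file path matches the one used in the commands above; connect_via_scheduler_file is a hypothetical helper name, not part of Dask.

```python
import os


def connect_via_scheduler_file(path="~/my.cluster.yaml"):
    """Connect a Dask client using the file written by dask-scheduler.

    The default path is an assumption copied from the commands above.
    """
    # Imported lazily so this module still loads where dask.distributed is absent.
    from dask.distributed import Client

    # Expand "~" explicitly before handing the path to the client.
    return Client(scheduler_file=os.path.expanduser(path))


if __name__ == "__main__":
    client = connect_via_scheduler_file()
    # Print the address the scheduler is bound to, to compare with the ib0 address.
    print(client.scheduler_info()["address"])
```

If the --interface option took effect, the printed address should carry the IP assigned to the InfiniBand interface rather than the Ethernet one.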

In the dask-scheduler and dask-worker commands above, I have assumed that your InfiniBand network interface is called ib0. You can check this by asking your IT department or by inspecting the output of ifconfig:

$ ifconfig
lo          Link encap:Local Loopback                       # Localhost
            inet addr:127.0.0.1  Mask:255.0.0.0
            inet6 addr: ::1/128 Scope:Host
eth0        Link encap:Ethernet  HWaddr XX:XX:XX:XX:XX:XX   # Ethernet
            inet addr:192.168.0.101
            ...
ib0         Link encap:Infiniband                           # Fast InfiniBand
            inet addr:172.42.0.101
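
The same check can be sketched from Python using only the standard library. This assumes a Unix-like system and that InfiniBand interfaces have names starting with "ib", which is a common convention but not guaranteed; infiniband_candidates and list_interface_names are hypothetical helper names.

```python
import socket


def infiniband_candidates(names, prefix="ib"):
    """Return the interface names that start with prefix.

    The "ib" prefix is an assumed naming convention, not a reliable test.
    """
    return [name for name in names if name.startswith(prefix)]


def list_interface_names():
    """Return the names of the network interfaces known to the OS."""
    # socket.if_nameindex() yields (index, name) pairs.
    return [name for _index, name in socket.if_nameindex()]


if __name__ == "__main__":
    names = list_interface_names()
    print("All interfaces:", names)
    print("Possible InfiniBand interfaces:", infiniband_candidates(names))
```

Any name this reports is only a candidate; confirm it with your IT department before passing it to --interface.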
MRocklin answered Sep 23 '22 11:09