I have a cluster with a high-performance network (InfiniBand). However, when I set up my Dask scheduler and workers, performance is not as fast as I would expect. How can I tell Dask to use this network?
Disclaimer: I'm just asking this question so that I can answer it. It has become a frequently asked question.
You can launch a Dask cluster using mpirun or mpiexec and the dask-mpi command line tool. This depends on the mpi4py library. It only uses MPI to start the Dask cluster and not for inter-node communication.
Custom computations: Dask just runs Python functions. Whether or not those functions use a GPU is orthogonal to Dask; it will work regardless.
As of dask.distributed version 1.16.3 you can specify a network interface to the dask-scheduler and dask-worker executables using the --interface keyword, like the following:
dask-scheduler --interface ib0 --scheduler-file ~/my.cluster.yaml
dask-worker --interface ib0 --scheduler-file ~/my.cluster.yaml
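Once the scheduler and workers are running, one way to check that they actually bound to the InfiniBand interface is to connect a client and look at the addresses they advertise. The following is a minimal sketch, not an official recipe: the helper names `host_of` and `report_addresses` are hypothetical, and it assumes the scheduler file path from the commands above.

```python
import os
from urllib.parse import urlsplit


def host_of(address):
    """Return the host part of an address such as 'tcp://172.42.0.101:8786'."""
    return urlsplit(address).hostname


def report_addresses(scheduler_file):
    """Connect to the cluster and print the hosts used by scheduler and workers."""
    # Imported lazily so host_of() is usable without dask.distributed installed.
    from dask.distributed import Client

    client = Client(scheduler_file=os.path.expanduser(scheduler_file))
    info = client.scheduler_info()
    print("scheduler:", host_of(info["address"]))
    for worker_address in info["workers"]:
        print("worker:   ", host_of(worker_address))
    return client


# Example (requires a running cluster started with the commands above):
# client = report_addresses("~/my.cluster.yaml")
```

If the printed hosts are addresses on the InfiniBand subnet rather than the Ethernet one, the --interface setting took effect.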
In the commands above I have assumed that your InfiniBand network interface is called ib0. You can check this by asking your IT department or by inspecting the output of ifconfig:
$ ifconfig
lo          Link encap:Local Loopback                       # Localhost
            inet addr:127.0.0.1  Mask:255.0.0.0
            inet6 addr: ::1/128 Scope:Host
eth0        Link encap:Ethernet  HWaddr XX:XX:XX:XX:XX:XX   # Ethernet
            inet addr:192.168.0.101
            ...
ib0         Link encap:Infiniband                           # Fast InfiniBand
            inet addr:172.42.0.101
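If ifconfig is not available on a node, the interface names can also be listed from Python. This is a rough sketch using only the standard library; the function names are hypothetical, and the "ib" name prefix is an assumption based on the ib0 example above, so your cluster may name its devices differently.

```python
import socket


def list_interfaces():
    """Return the names of all network interfaces (Unix only)."""
    # socket.if_nameindex() yields (index, name) pairs.
    return [name for _, name in socket.if_nameindex()]


def infiniband_candidates(names):
    """Return names that look like InfiniBand devices (assumed 'ib' prefix)."""
    return sorted(name for name in names if name.startswith("ib"))


if __name__ == "__main__":
    names = list_interfaces()
    print("interfaces:", names)
    print("InfiniBand candidates:", infiniband_candidates(names))
```

Any name this reports is only a candidate to pass to --interface; confirm it against the IP range your InfiniBand network actually uses.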