What are the scaling limits of Dask.distributed?

Are there any anecdotal cases of Dask.distributed deployments with hundreds of worker nodes? Is distributed meant to scale to a cluster of this size?

asked Oct 26 '16 by bcollins

People also ask

Is Dask distributed?

Dask.distributed is a centrally managed, distributed, dynamic task scheduler. The central dask-scheduler process coordinates the actions of several dask-worker processes spread across multiple machines and the concurrent requests of several clients.
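
As a rough illustration of that scheduler/worker/client split, here is a minimal sketch (assuming the dask.distributed package is installed) that starts both pieces in one process with LocalCluster and connects a Client to them:

    from dask.distributed import Client, LocalCluster

    # LocalCluster starts a scheduler plus a few local worker processes;
    # in a real deployment the scheduler and workers run on separate machines.
    cluster = LocalCluster(n_workers=4, threads_per_worker=1)
    client = Client(cluster)  # the client sends work to the scheduler

    # The scheduler routes this task to one of the workers.
    future = client.submit(sum, [1, 2, 3])
    print(future.result())  # 6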

How many partitions should I have in Dask?

You should aim for partitions that have around 100MB of data each. Additionally, reducing partitions is very helpful just before shuffling, which creates n log(n) tasks relative to the number of partitions. DataFrames with fewer than 100 partitions are much easier to shuffle than DataFrames with tens of thousands of partitions.
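
For example, with a Dask DataFrame you can ask for partitions of a target size before a shuffle-heavy step. This is a hedged sketch; the file pattern and column name are placeholders:

    import dask.dataframe as dd

    # Hypothetical input files and column name; adjust to your data.
    df = dd.read_csv("data-*.csv")

    # Aim for roughly 100MB per partition before shuffling.
    df = df.repartition(partition_size="100MB")

    result = df.groupby("key").sum().compute()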

How does Dask manage memory?

Dask.distributed stores the results of tasks in the distributed memory of the worker nodes. The central scheduler tracks all data on the cluster and determines when data should be freed. Completed results are usually cleared from memory as quickly as possible in order to make room for more computation.
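
In practice this means results stay on the workers only while something still references their futures. A small sketch, with illustrative names:

    from dask.distributed import Client

    client = Client()  # starts a local cluster by default

    # Each future's result lives in some worker's memory.
    futures = client.map(lambda x: x ** 2, range(100))

    # Reduce on the cluster rather than pulling everything back locally.
    total = client.submit(sum, futures).result()

    # Dropping the last references lets the scheduler free the intermediate data.
    del futures
    print(total)  # 328350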


1 Answer

Yes

The largest Dask.distributed cluster I've seen is around one thousand nodes. We could theoretically go larger, but not by a huge amount.

The current limit is that the scheduler incurs around 200 microseconds of overhead per task. This translates to about 5000 tasks per second. If each of your tasks takes around one second, then the scheduler can saturate around 5000 cores.
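
The arithmetic behind those numbers, written out as a quick sketch:

    # Back-of-the-envelope scheduler throughput estimate.
    overhead_per_task = 200e-6                  # ~200 microseconds of scheduler time per task
    tasks_per_second = 1 / overhead_per_task    # ~5000 tasks per second
    task_duration = 1.0                         # seconds of real work per task
    saturated_cores = tasks_per_second * task_duration  # ~5000 cores kept busy
    print(int(tasks_per_second), int(saturated_cores))  # 5000 5000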

Historically we ran into other limitations like open file handle limits and such. These have all been cleaned up to the scale that we've seen (1000 nodes) and generally things are fine on Linux or OSX. Dask schedulers on Windows stop scaling in the low hundreds of nodes (though you can use a Linux scheduler with Windows workers). I would not be surprised to see other issues pop up as we scale out to 10k nodes.

In short, you probably don't want to use Dask to replace MPI workloads on your million core Big Iron SuperComputer or at Google Scale. Otherwise you're probably fine.

answered Oct 20 '22 by MRocklin