In the API there is a way to restart all workers and to shut down the client completely, but I see no way to stop all workers while keeping the client unchanged. Is there a way to do this that I cannot find, or is it a feature that doesn't exist?
When we create a Client object it registers itself as the default Dask scheduler. All .compute() methods will automatically start using the distributed system. We can stop this behavior by passing the set_as_default=False keyword argument when starting the Client.
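For example, a minimal sketch of opting out of that default registration (the array computation is only illustrative):

    from dask.distributed import Client
    import dask.array as da

    # Not registered as the default scheduler
    client = Client(set_as_default=False)

    x = da.ones((1000, 1000), chunks=(100, 100))
    x.sum().compute()                 # still uses the default local scheduler
    client.compute(x.sum()).result()  # explicitly routes work to the cluster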
The Dask Scheduler: the Scheduler acts as a middle layer between the client and the workers, instructing workers to execute the actual computations requested by the client. It also helps the workers coordinate with each other, deciding who should do which tasks.
A Worker node in a Dask distributed cluster performs two functions: it serves data from a local dictionary, and it performs computation on that data and on data from peers.
The Client connects users to a Dask cluster. It provides an asynchronous user interface around functions and futures. This class resembles executors in concurrent.futures, but also allows Future objects within submit/map calls.
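For example, a small sketch of that submit/map interface, assuming c is a connected Client (inc is just an illustrative helper):

    def inc(x):
        return x + 1

    future = c.submit(inc, 10)        # returns a Future immediately
    futures = c.map(inc, range(4))    # one Future per input
    total = c.submit(sum, futures)    # Futures may be passed directly to submit
    total.result()                    # -> 10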
This seems like a feature that does not exist, but is nevertheless doable using the current code. You can use run_on_scheduler to interact with the methods of the scheduler itself.
    # Collect the addresses of all currently connected workers
    workers = list(c.scheduler_info()['workers'])
    # Ask the scheduler to gracefully retire (and close) each of those workers
    c.run_on_scheduler(lambda dask_scheduler=None:
        dask_scheduler.retire_workers(workers, close_workers=True))
where c is a Client; retire_workers asks each worker to exit gracefully.
There are probably other ways to achieve this. Note that the scheduler remains running in this case; it was not clear from the question whether that is desired.
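For completeness, an end-to-end sketch, assuming a LocalCluster started purely for illustration (exact retire_workers and nanny behavior can vary between dask.distributed versions):

    from dask.distributed import Client, LocalCluster

    cluster = LocalCluster(n_workers=2, threads_per_worker=1)
    c = Client(cluster)

    # Retire every connected worker
    workers = list(c.scheduler_info()['workers'])
    c.run_on_scheduler(lambda dask_scheduler=None:
        dask_scheduler.retire_workers(workers, close_workers=True))

    # The client and scheduler are still alive; only the workers are gone
    print(c.scheduler_info()['workers'])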