I have a few basic questions on Dask:
As an edit: My application is that I want to parallelize a for loop either on my local machine or on a cluster (i.e. it should work on a cluster).
As a second edit: I think I am also somewhat unclear regarding the relation between Futures and delayed computations.
Thx
The Dask delayed function decorates your functions so that they operate lazily. Rather than executing your function immediately, it defers execution, placing the function and its arguments into a task graph. delayed([obj, name, pure, nout, traverse]) wraps a function or object to produce a Delayed.
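Since the question is about parallelizing a for loop, here is a minimal sketch of how that might look with delayed. The function square and the loop bounds are made up for illustration; the threaded scheduler is chosen only so the snippet runs on a single machine.

```python
from dask import delayed

def square(x):
    # Hypothetical stand-in for the real body of the loop
    return x * x

# Calling the wrapped function only records a task; nothing runs yet
lazy_results = [delayed(square)(i) for i in range(5)]

# Combine the lazy pieces into one final task
total = delayed(sum)(lazy_results)

# compute() walks the task graph and actually executes it
result = total.compute(scheduler="threads")
print(result)
```

The loop itself stays an ordinary Python loop; it just builds the graph instead of doing the work.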
Dask supports a real-time task framework that extends Python's concurrent.futures interface. Dask futures reimplement most of the Python futures API, allowing you to scale your Python futures workflow across a Dask cluster with minimal code changes.
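The same loop written with futures could look roughly like this. It assumes dask.distributed is installed; Client(processes=False) starts an in-process local cluster purely for demonstration, and on a real cluster you would pass the scheduler address instead. The function square is again hypothetical.

```python
from dask.distributed import Client

def square(x):
    # Hypothetical stand-in for the real body of the loop
    return x * x

# In-process local cluster, assumed here just to keep the sketch self-contained
client = Client(processes=False)

# map() submits every call right away and returns one Future per input
futures = client.map(square, range(5))

# gather() blocks until all results are back, preserving input order
results = client.gather(futures)
print(results)

client.close()
```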
When we create a Client object it registers itself as the default Dask scheduler. All .compute() methods will automatically start using the distributed system. We can stop this behavior by using the set_as_default=False keyword argument when starting the Client.
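A small sketch of what that keyword changes, again assuming an in-process local cluster for illustration. With set_as_default=False, a plain .compute() does not go through the client, so the work has to be handed to it explicitly.

```python
import dask
from dask.distributed import Client

# The client exists but is not registered as the default scheduler
client = Client(processes=False, set_as_default=False)

lazy_sum = dask.delayed(sum)([1, 2, 3])

# Runs on a local scheduler, not on the distributed one
local_value = lazy_sum.compute(scheduler="synchronous")

# Explicitly hand the same Delayed to the client; this returns a Future
future = client.compute(lazy_sum)
remote_value = future.result()

client.close()
```

This also touches on the Futures versus delayed question: client.compute turns a lazy Delayed into a Future that is already running.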
1) Yup. If you're sending the data through a network, you have to have some way of asking the computer doing the computing for you how's that number-crunching coming along, and Futures represent more or less exactly that.
2) No. With Futures, you're executing the functions eagerly: spinning up the computations as soon as you can, then waiting for the results to come back (from another thread/process locally, or from some remote machine you've offloaded the job onto). The relevant abstraction here would be a queue (a priority queue, specifically).
3) For a Delayed instance, for instance, you could do some_delayed.dask, or for an Array, Array.dask; optionally wrap the whole thing in either dict() or vars(). I don't know for sure if it's reliably set up this way for every single API, though (I would assume so, but you know what they say about what assuming makes of the two of us...).
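As a rough illustration of point 3, this is how peeking at the graph of a Delayed might look. The function inc is made up, and the exact form of the graph values varies between Dask versions, so the sketch only looks at the keys.

```python
from dask import delayed

def inc(x):
    # Hypothetical example function
    return x + 1

lazy = delayed(inc)(1)

# .dask holds the task graph; it behaves like a mapping, so dict() works
graph = dict(lazy.dask)

# Each key names one task; the Delayed's own key is among them
for key in graph:
    print(key)

value = lazy.compute(scheduler="synchronous")
```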
4) The simplest analogy would probably be: Delayed is essentially a fancy Python yield wrapper over a function; Future is essentially a fancy async/await wrapper over a function.