I have a Snakemake workflow that I've been using to train deep learning models with TensorFlow. At a high level, there are a few longish-running jobs (model training) that can be run in parallel. I would like to run these in the cloud, and dask-cloudprovider seems like a promising option since I can easily leverage GPUs on ECS. To do this, though, would I have to rewrite my workflow using Dask functions (maybe dask.delayed)? Or is there some way to get Snakemake to use Dask?
If you do a web search for "dask snakemake" you'll find a GitHub issue from 2017 that you might want to read through. It's certainly possible, but someone would need to write the integration.
You may also want to try Dask's integration with Airflow or, for something a bit more modern, the Prefect library.
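If you do end up rewriting the training steps with Dask directly, here is a minimal sketch of what the dask.delayed route could look like. It is only an illustration under some assumptions: `train_model`, `build_tasks`, `run_all`, and the config dictionaries are hypothetical names standing in for your own Snakemake rules, and the placeholder body does no real TensorFlow training.

```python
import dask
from dask import delayed


def train_model(config):
    """Hypothetical stand-in for one long-running training job."""
    # A real version would build and fit a TensorFlow model here.
    return {"name": config["name"], "learning_rate": config["learning_rate"]}


def build_tasks(configs):
    """Wrap each training job in a lazy task; nothing runs yet."""
    return [delayed(train_model)(cfg) for cfg in configs]


def run_all(configs):
    """Run the independent training jobs in parallel and collect the results."""
    tasks = build_tasks(configs)
    # Without a distributed Client this uses Dask's local threaded scheduler.
    # If a dask.distributed Client has been created first (for example one
    # connected to a dask-cloudprovider cluster), the same call is sent there.
    return list(dask.compute(*tasks))


if __name__ == "__main__":
    # Hypothetical configurations, one per model to train.
    configs = [
        {"name": "model_a", "learning_rate": 0.01},
        {"name": "model_b", "learning_rate": 0.001},
    ]
    for result in run_all(configs):
        print(result)
```

The jobs here are independent, so each one becomes a single delayed call. Dependencies between Snakemake rules would have to be expressed by passing one delayed result into another, which is the part you would be rewriting by hand.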