I defined three tasks T1, T2, and T3, and then a task T4 as follows:
import luigi

class T4(luigi.Task):
    # T4 depends on T1, T2, and T3
    def requires(self):
        return [T1(), T2(), T3()]
Is there a natural way to tell Luigi that I want these tasks T1, T2, and T3 to be executed in parallel?
Parallel tasks are split into subtasks that are assigned to multiple workers and then completed simultaneously. A worker system can carry out both parallel and concurrent tasks by working on multiple tasks at the same time while also breaking down each task into sub-tasks that are executed simultaneously.
By default, Luigi tasks run using the central Luigi scheduler. To run a task using that scheduler, omit the --local-scheduler argument from the command, for example: python -m luigi --module word-frequency GetTopBooks.
Luigi is a workflow management system for launching groups of tasks with defined dependencies between them. It is a Python-based API developed by Spotify® to build and execute pipelines of Hadoop jobs, but it can also be used to create workflows with external jobs written in R, Scala, or Spark.
It depends on what dependencies T1, T2, and T3 have. If they do not share another task as a common dependency, you can just run your task with --workers=3 and Luigi will run each task in a separate worker.