
parallelizing tasks in Luigi Orchestrator

I defined three tasks T1, T2, and T3, and then a task T4 as follows:

import luigi

class T4(luigi.Task):
    def requires(self):
        return [T1(), T2(), T3()]

Is there a natural way to tell Luigi that I want these tasks T1, T2, and T3 to be executed in parallel?

asked Dec 07 '15 05:12 by sweeeeeet




1 Answer

It depends on the dependencies of T1, T2, and T3. If they do not share another task as a common dependency, you can run your task with --workers=3, and Luigi will run each of them in a separate worker.

answered Sep 18 '22 21:09 by matagus