In the web interface and in https://github.com/spotify/luigi/blob/master/luigi/task.py I can see that a Task can have "resources". There is also a placeholder method on the Task class, process_resources(), that just returns resources, which is an empty dictionary by default.
What is this mythical resources thing?
I haven't tested this, but it looks like resources is a set of arbitrary named counts that the scheduler uses to decide whether to throttle jobs, based on the limits set in the config. From the docs:
This section can contain arbitrary keys. Each of these specifies the amount of a global resource that the scheduler can allow workers to use. The scheduler will prevent running jobs with resources specified from exceeding the counts in this section. Unspecified resources are assumed to have limit 1. Example resources section for a configuration with 2 hive resources and 1 mysql resource:
    [resources]
    hive: 2
    mysql: 1
Note that it was not necessary to specify the 1 for mysql here, but it is good practice to do so when you have a fixed set of resources.
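To make the quoted behaviour concrete, here is a toy model of the check as I read it from the docs. It is not Luigi's scheduler code and does not import Luigi. The names can_run and run_queue and the task names are hypothetical, and the dicts stand in for the resources dictionary a Task would declare.

```python
# Toy model of the resource check described in the docs.
# This is NOT Luigi's actual scheduler code; all names here are hypothetical.

# Limits as they would appear in the [resources] config section.
LIMITS = {"hive": 2, "mysql": 1}
DEFAULT_LIMIT = 1  # the docs say unspecified resources are assumed to have limit 1


def can_run(requested, in_use, limits=LIMITS):
    """Return True if starting a task keeps every resource within its limit."""
    for name, amount in requested.items():
        limit = limits.get(name, DEFAULT_LIMIT)
        if in_use.get(name, 0) + amount > limit:
            return False
    return True


def run_queue(tasks, limits=LIMITS):
    """Admit tasks in order; return (admitted names, held-back names)."""
    in_use = {}
    admitted, waiting = [], []
    for task_name, requested in tasks:
        if can_run(requested, in_use, limits):
            # Reserve the requested amounts for the admitted task.
            for name, amount in requested.items():
                in_use[name] = in_use.get(name, 0) + amount
            admitted.append(task_name)
        else:
            waiting.append(task_name)
    return admitted, waiting


# Hypothetical tasks, each paired with a dict shaped like a Task's resources.
tasks = [
    ("hive_a", {"hive": 1}),
    ("hive_b", {"hive": 1}),
    ("hive_c", {"hive": 1}),   # a third hive job would exceed the limit of 2
    ("mysql_a", {"mysql": 1}),
    ("ftp_a", {"ftp": 1}),     # "ftp" is not in the config, so its limit is 1
    ("ftp_b", {"ftp": 1}),
]
admitted, waiting = run_queue(tasks)
print(admitted)
print(waiting)
```

In this model the third hive job and the second ftp job are held back until something frees up, which matches the "prevent running jobs ... from exceeding the counts" wording. How the real scheduler orders or retries the waiting jobs is not something this sketch tries to capture.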