Massive scheduling in Ruby

I need a scheduler for large dynamic collections of tasks. At the moment I'm looking at resque-scheduler, rufus-scheduler, and clockwork. I'll be grateful for advice on choosing which one (or what alternative) to use.

Some details:

  • There is a large collection of tasks (up to 100K) to be periodically executed.
  • The shortest execution period is 1h.
  • New tasks may appear from time to time. Existing tasks may be changed or deleted.
  • Minimizing scheduling latency is not mission-critical here (scalability and sustainability are most important).
  • Task execution is not a heavy operation and can easily be parallelized.

To summarize, I need something like cron for a Ruby project that can handle a large, dynamically changing collection of tasks.

Update: I've spent a day experimenting with scheduling libraries, and I'd like to briefly summarize what I've learned.

I've settled on the Clockwork and resque-scheduler libraries, since these are more mature projects with more detailed documentation. Resque-scheduler is based on rufus-scheduler, while Clockwork is inspired by it. Both can be used for the solution I'm looking for.

Both are standalone services meant to run in a separate process, and both can handle a virtually unlimited number of tasks scheduled for one-off or recurring execution. Tasks are executed in threads.
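
To illustrate the "tasks are executed in threads" part, here is a minimal sketch of running a batch of due tasks on a fixed pool of worker threads. It uses only Ruby's standard Queue, Mutex and Thread; the run_in_pool helper, the pool size of 4 and the task names are assumptions made up for the example, not how Clockwork or resque-scheduler actually do it.

```ruby
# Hypothetical helper: runs callables on a fixed number of worker threads
# and returns their results keyed by task name.
def run_in_pool(tasks, workers: 4)
  queue = Queue.new
  tasks.each { |name, job| queue << [name, job] }
  queue.close # once closed and drained, pop returns nil and workers exit

  results = {}
  lock = Mutex.new

  threads = Array.new(workers) do
    Thread.new do
      while (item = queue.pop)
        name, job = item
        value = job.call
        # The results hash is shared between threads, so guard writes.
        lock.synchronize { results[name] = value }
      end
    end
  end

  threads.each(&:join)
  results
end

# Twenty cheap tasks standing in for whatever is due right now.
due_tasks = (1..20).to_h { |i| ["task-#{i}", -> { i * i }] }
results = run_in_pool(due_tasks)
```

Because the tasks are light, a small fixed pool keeps the thread count bounded no matter how many tasks become due at the same moment.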

Clockwork pros:

  • It can load scheduled tasks from a database (through ActiveRecord or any arbitrary source).
  • It can also update scheduled tasks dynamically by polling the DB for changes.

Clockwork cons:

  • DB polling is a potential bottleneck here.
  • Polling interval is 1 minute (plus the time to reschedule all tasks), which is a bit too slow.
  • Addressing scheduled tasks (to unschedule or change them) is undocumented, so using this feature looks like a hack to me.
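
The polling model can be sketched roughly as follows: on each pass the scheduler reads the task rows and reconciles them with what it already has in memory. This is an illustration only, not Clockwork's implementation; the PollingSchedule class and the :id, :period and :updated_at fields are hypothetical. Comparing updated_at means only changed rows are rescheduled, but every pass still reads the whole table, which is where the bottleneck comes from.

```ruby
# Hypothetical in-memory schedule that is reconciled against polled rows.
class PollingSchedule
  attr_reader :entries

  def initialize
    @entries = {} # task id => { period: seconds, updated_at: Time }
  end

  # rows: array of hashes, as they might come back from a DB query.
  # Returns which task ids were added, changed or removed in this pass.
  def sync(rows)
    changes = { added: [], changed: [], removed: [] }
    seen = {}

    rows.each do |row|
      id = row[:id]
      seen[id] = true
      current = @entries[id]

      if current.nil?
        changes[:added] << id
      elsif current[:updated_at] != row[:updated_at]
        changes[:changed] << id
      else
        next # unchanged, nothing to reschedule
      end

      @entries[id] = { period: row[:period], updated_at: row[:updated_at] }
    end

    # Anything we hold that the DB no longer returns has been deleted.
    (@entries.keys - seen.keys).each do |id|
      @entries.delete(id)
      changes[:removed] << id
    end

    changes
  end
end

t0 = Time.at(0)
schedule = PollingSchedule.new

first = schedule.sync([
  { id: 1, period: 3600,   updated_at: t0 },
  { id: 2, period: 86_400, updated_at: t0 }
])

second = schedule.sync([
  { id: 1, period: 7200, updated_at: t0 + 60 }, # edited
  { id: 3, period: 3600, updated_at: t0 }       # new; task 2 is gone
])
```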

I've implemented an alternative Manager class for Clockwork (this is a core part of the gem that controls scheduling) to allow scheduling control through ZeroMQ messages. So the main service in my project can send commands to the scheduler, like "run this each day", or "unschedule task #10", and the scheduler executes each request immediately.
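
The command-driven idea looks roughly like the sketch below. It is not the actual Manager class: a plain Ruby Queue stands in for the ZeroMQ socket, and the JSON message format (action, id, every) and the CommandScheduler class are assumptions invented for the example.

```ruby
require 'json'

# Hypothetical command-driven schedule registry. In the real setup the
# messages would arrive over ZeroMQ; here they are popped from a Queue.
class CommandScheduler
  attr_reader :tasks

  def initialize
    @tasks = {} # task id => period in seconds
  end

  # Applies one JSON-encoded command and returns a status symbol.
  def handle(message)
    cmd = JSON.parse(message)
    case cmd['action']
    when 'schedule'
      @tasks[cmd['id']] = cmd['every']
      :scheduled
    when 'unschedule'
      @tasks.delete(cmd['id']) ? :unscheduled : :unknown_task
    else
      :unknown_action
    end
  end
end

inbox = Queue.new
inbox << JSON.generate(action: 'schedule', id: 10, every: 86_400) # "run this each day"
inbox << JSON.generate(action: 'schedule', id: 11, every: 3600)
inbox << JSON.generate(action: 'unschedule', id: 10)              # "unschedule task #10"

scheduler = CommandScheduler.new
results = []
results << scheduler.handle(inbox.pop) until inbox.empty?
```

Each command is applied as soon as it is read, so there is no polling interval to wait for.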

I have less experience with resque-scheduler, but at this point it looks like a better solution.

resque-scheduler pros:

  • Redis-based persistence. The manual states that scheduled tasks can be recovered after a service restart.
  • Dynamic scheduling with a clean API. You just call Resque.remove_schedule(name) to drop a specific task.
  • Web UI. Not too important, but nice to have.

resque-scheduler cons:

  • It requires Redis to be installed.

Maybe something else will come up after a closer look, but there is nothing else at the moment.

That is what I have now. BTW, I've published a number of links to the scheduling-related Ruby gems on GitHub.

Alex Musayev asked Jul 14 '14 20:07


1 Answer

  • Whenever (https://github.com/javan/whenever)
  • rufus-scheduler (https://github.com/jmettraux/rufus-scheduler)
  • Clockwork (https://github.com/tomykaira/clockwork)

are more or less pure schedulers. Whenever is backed by cron (crond), so it's solid (but each job gets executed in a separate process). rufus-scheduler and Clockwork are similar in-process Ruby schedulers (Clockwork was inspired by rufus-scheduler).

Resque-scheduler (https://github.com/resque/resque-scheduler) builds on top of Resque (task management) and rufus-scheduler (schedule management).

You should have a look at Sidekiq (http://sidekiq.org/) too, and at the schedulers built for it: https://www.google.com/?q=sidekiq%20scheduler#q=sidekiq+scheduler

So learn about Resque and Sidekiq, then look at the schedulers available for them. If nothing suits you, look at the schedulers themselves (Whenever, rufus-scheduler, Clockwork, ...); maybe you can build on top of them.

jmettraux answered Nov 14 '22 03:11