
Best practice for Rails App to run a long task in the background?

The Workling plugin allows you to schedule background tasks in a queue, and the workers perform the lengthy task. As of version 0.3 you can ask a worker for its status, which would let you display some nifty progress bars.

Another cool feature of Workling is that the asynchronous backend can be switched: you can use DelayedJob, Spawn (classic fork), Starling...


I have a very high-volume site that generates lots of large CSV files. These sometimes take several minutes to complete. I do the following:

  • I have a jobs table with details of the requested file. When the user requests a file, the request goes in that table and the user is taken to a "jobs status" page that lists all of their jobs.
  • I have a rake task that runs all outstanding jobs (a class method on the Job model).
  • I have a separate install of Rails on another box that handles these jobs. This box just does jobs, and is not accessible to the outside world.
  • On this separate box, a cron job runs all outstanding jobs every 60 seconds, unless jobs are still running from the last invocation.
  • The user's job status page auto-refreshes to show the status of the job (which is updated by the jobs box as the job is started, running, then finished). Once the job is done, a link appears to the results file.

It may be too heavy-duty if you just plan to have one or two running at a time, but if you want to scale... :)
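The "unless jobs are still running from the last invocation" check in the cron step could be sketched with a lock file. This is only one way to do it, not necessarily how the setup above works: the helper name `with_exclusive_lock` and the lock path are made up for illustration, and it relies on Ruby's standard `File#flock` with a non-blocking exclusive lock.

```ruby
# Hypothetical guard: runs the block only if no other invocation holds the lock.
# Returns true if the block ran, false if another run is still in progress.
def with_exclusive_lock(lock_path)
  File.open(lock_path, File::RDWR | File::CREAT, 0o644) do |file|
    # LOCK_NB makes flock return false instead of waiting when the lock is taken.
    return false unless file.flock(File::LOCK_EX | File::LOCK_NB)

    yield
    true
  end
end

# Usage inside the rake task (Job.run_outstanding is a hypothetical class method):
#   with_exclusive_lock("/tmp/jobs.lock") { Job.run_outstanding }
```

The lock is released when the file is closed, including when the process dies, so a crashed run should not block the next cron invocation.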


Calling ./script/runner in the background worked best for me. (I was also doing PDF generation.) It seems like the lowest common denominator, while also being the simplest to implement. Here's a write-up of my experience.
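A rough sketch of what "calling it in the background" can look like from Ruby, using the standard `Process.spawn` and `Process.detach`. The helper name `run_in_background`, the log path and the `PdfJob.generate` call in the usage comment are assumptions for illustration, not taken from the write-up.

```ruby
# Starts a command as a child process and returns its pid without waiting.
# Output goes to log_path (discarded by default).
def run_in_background(*command, log_path: File::NULL)
  pid = Process.spawn(*command, out: log_path, err: [:child, :out])
  # Detach so the finished child is reaped and does not linger as a zombie.
  Process.detach(pid)
  pid
end

# Usage from a controller action (PdfJob.generate is hypothetical):
#   run_in_background("./script/runner", "PdfJob.generate(42)",
#                     log_path: "log/pdf_job.log")
```

The request returns as soon as the child has started; the job itself still needs some way to report that it has finished, for example a status column in the database.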


A simple solution that doesn't require any extra Gems or plugins would be to create a custom Rake task for handling the PDF generation. You could model the PDF generation process as a state machine with states such as submitted, processing and complete that are stored in the model's database table. The initial HTTP request to the Rails application would simply add a record to the table with a submitted state and return.

There would be a cron job that runs your custom Rake task as a separate Ruby process, so the main Rails application is unaffected. The Rake task can use ActiveRecord to find all the models that have the submitted state, change the state to processing and then generate the associated PDFs. Finally, it should set the state to complete. This enables your AJAX calls within the Rails app to monitor the state of the PDF generation process.
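The state transitions can be sketched in plain Ruby, without Rails, roughly like this. `PdfRequest` is a hypothetical stand-in for the ActiveRecord model, `process_submitted` is an invented name, and the `failed` state is an extra assumption so that a broken job does not stay in processing forever.

```ruby
# Hypothetical stand-in for a model with a `state` column.
PdfRequest = Struct.new(:id, :state, :output)

# Moves every submitted request through submitted -> processing -> complete.
# The block does the slow work (e.g. PDF generation) and returns the result.
def process_submitted(requests)
  requests.select { |r| r.state == "submitted" }.each do |request|
    request.state = "processing"
    begin
      request.output = yield(request)
      request.state = "complete"
    rescue StandardError
      # Assumption: an extra state, not part of the description above.
      request.state = "failed"
    end
  end
end

requests = [PdfRequest.new(1, "submitted"), PdfRequest.new(2, "complete")]
process_submitted(requests) { |r| "report-#{r.id}.pdf" }
```

In the real Rake task each state change would be saved to the database, so the AJAX status check can see it while the job is still running.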

If you put your Rake task within your_rails_app/lib/tasks then it has access to the models within your Rails application. The skeleton of such a pdf_generator.rake would look like this:

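namespace :pdfgenerator do
  desc 'Generates PDFs etc.'
  # Depending on :environment loads the Rails app, so the models are available.
  task :run => :environment do
    # Code goes here: find the submitted records, mark them as processing,
    # generate the PDFs, then mark them as complete.
  end
end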

As noted in the wiki, there are a few downsides to this approach. You'll be using cron to regularly create a fairly heavyweight Ruby process, and the timing of your cron jobs would need careful tuning to ensure that each run has sufficient time to complete before the next one starts. However, the approach is simple and should meet your needs.