Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

polling with delayed_job

I have a process which takes generally a few seconds to complete so I'm trying to use delayed_job to handle it asynchronously. The job itself works fine, my question is how to go about polling the job to find out if it's done.

I can get an id from delayed_job by simply assigning it to a variable:

job = Available.delay.dosomething(:var => 1234)

+------+----------+----------+------------+------------+-------------+-----------+-----------+-----------+------------+-------------+
| id   | priority | attempts | handler    | last_error | run_at      | locked_at | failed_at | locked_by | created_at | updated_at  |
+------+----------+----------+------------+------------+-------------+-----------+-----------+-----------+------------+-------------+
| 4037 | 0        | 0        | --- !ru... |            | 2011-04-... |           |           |           | 2011-04... | 2011-04-... |
+------+----------+----------+------------+------------+-------------+-----------+-----------+-----------+------------+-------------+

But as soon as it completes the job it deletes it and searching for the completed record returns an error:

@job=Delayed::Job.find(4037)

ActiveRecord::RecordNotFound: Couldn't find Delayed::Backend::ActiveRecord::Job with ID=4037

@job= Delayed::Job.exists?(params[:id])

Should I bother to change this, and maybe postpone the deletion of complete records? I'm not sure how else I can get a notification of it's status. Or is polling a dead record as proof of completion ok? Anyone else face something similar?

like image 834
holden Avatar asked Apr 07 '11 13:04

holden


3 Answers

Let's start with the API. I'd like to have something like the following.

@available.working? # => true or false, so we know it's running
@available.finished? # => true or false, so we know it's finished (already ran)

Now let's write the job.

class AwesomeJob < Struct.new(:options)

  def perform
    do_something_with(options[:var])
  end

end

So far so good. We have a job. Now let's write logic that enqueues it. Since Available is the model responsible for this job, let's teach it how to start this job.

class Available < ActiveRecord::Base

  def start_working!
    Delayed::Job.enqueue(AwesomeJob.new(options))
  end

  def working?
    # not sure what to put here yet
  end

  def finished?
    # not sure what to put here yet
  end

end

So how do we know if the job is working or not? There are a few ways, but in rails it just feels right that when my model creates something, it's usually associated with that something. How do we associate? Using ids in database. Let's add a job_id on Available model.

While we're at it, how do we know that the job is not working because it already finished, or because it didn't start yet? One way is to actually check for what the job actually did. If it created a file, check if file exists. If it computed a value, check that result is written. Some jobs are not as easy to check though, since there may be no clear verifiable result of their work. For such case, you can use a flag or a timestamp in your model. Assuming this is our case, let's add a job_finished_at timestamp to distinguish a not yet ran job from an already finished one.

class AddJobIdToAvailable < ActiveRecord::Migration
  def self.up
    add_column :available, :job_id, :integer
    add_column :available, :job_finished_at, :datetime
  end

  def self.down
    remove_column :available, :job_id
    remove_column :available, :job_finished_at
  end
end

Alright. So now let's actually associate Available with its job as soon as we enqueue the job, by modifying the start_working! method.

def start_working!
  job = Delayed::Job.enqueue(AwesomeJob.new(options))
  update_attribute(:job_id, job.id)
end

Great. At this point I could've written belongs_to :job, but we don't really need that.

So now we know how to write the working? method, so easy.

def working?
  job_id.present?
end

But how do we mark the job finished? Nobody knows a job has finished better than the job itself. So let's pass available_id into the job (as one of the options) and use it in the job. For that we need to modify the start_working! method to pass the id.

def start_working!
  job = Delayed::Job.enqueue(AwesomeJob.new(options.merge(:available_id => id))
  update_attribute(:job_id, job.id)
end

And we should add the logic into the job to update our job_finished_at timestamp when it's done.

class AwesomeJob < Struct.new(:options)

  def perform
    available = Available.find(options[:available_id])
    do_something_with(options[:var])

    # Depending on whether you consider an error'ed job to be finished
    # you may want to put this under an ensure. This way the job
    # will be deemed finished even if it error'ed out.
    available.update_attribute(:job_finished_at, Time.current)
  end

end

With this code in place we know how to write our finished? method.

def finished?
  job_finished_at.present?
end

And we're done. Now we can simply poll against @available.working? and @available.finished? Also, you gain the convenience of knowing which exact job was created for your Available by checking @available.job_id. You can easily turn it into a real association by saying belongs_to :job.

like image 92
Max Chernyak Avatar answered Oct 24 '22 09:10

Max Chernyak


I ended up using a combination of Delayed_Job with an after(job) callback which populates a memcached object with the same ID as the job created. This way I minimize the number of times I hit the database asking for the status of the job, instead polling the memcached object. And it contains the entire object I need from the completed job, so I don't even have a roundtrip request. I got the idea from an article by the github guys who did pretty much the same thing.

https://github.com/blog/467-smart-js-polling

and used a jquery plugin for the polling, which polls less frequently, and gives up after a certain number of retries

https://github.com/jeremyw/jquery-smart-poll

Seems to work great.

 def after(job)
    prices = Room.prices.where("space_id = ? AND bookdate BETWEEN ? AND ?", space_id.to_i, date_from, date_to).to_a
    Rails.cache.fetch(job.id) do
      bed = Bed.new(:space_id => space_id, :date_from => date_from, :date_to => date_to, :prices => prices)
    end
  end
like image 15
holden Avatar answered Oct 24 '22 10:10

holden


I think that the best way would be to use the callbacks available in the delayed_job. These are: :success, :error and :after. so you can put some code in your model with the after:

class ToBeDelayed
  def perform
    # do something
  end

  def after(job)
    # do something
  end
end

Because if you insist of using the obj.delayed.method, then you'll have to monkey patch Delayed::PerformableMethod and add the after method there. IMHO it's far better than polling for some value which might be even backend specific (ActiveRecord vs. Mongoid, for instance).

like image 13
Roman Avatar answered Oct 24 '22 09:10

Roman