I suspect that some of our Active Job jobs are disappearing, but I don't know why. Below is one case where I have found evidence of its disappearance, but not the reason for it.
Our site makes use of an external cloud printing service. We kick the jobs off and then check their status. Having successfully created the remote cloud print, we create an active job to check the status immediately. If it's finished (successfully or otherwise), it's marked as such. If not then the check status job creates another one, with a slight delay. The delay increases each time.
On a status check today, the logs show that the wait reached 128 seconds. But the next status check did not occur, and there are no errors in the log either.
We use Active Job backed by Delayed Job. The code for the status check job is below. I can't see any flaw in the logic that would prevent it from either collecting a final status or making another attempt after a wait.
class CheckCloudPrintStatusJob < ApplicationJob
  queue_as :default

  def perform(cloud_print, count = 0)
    cloud_print.update_status
    unless cloud_print.finished?
      count += 1
      wait = 2**(count-1)
      if count > 15
        cloud_print.mark_as_failed
        puts "~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~"
        puts "~~~~~~~~~~~~~~~~~~ Cloud printing ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~"
        puts "Cloud print ##{cloud_print.id} failed"
        puts "Finally waited #{wait} seconds and then cancelled."
        puts "~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~"
      else
        puts "~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~"
        puts "~~~~~~~~~~~~~~~~~~ Cloud printing ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~"
        puts "Checking status of cloud print ##{cloud_print.id}"
        puts "Waiting #{wait} seconds and then retrying."
        puts "~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~"
        CheckCloudPrintStatusJob.set(wait: wait.seconds).perform_later(cloud_print, count)
      end
    end
  end
end
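To put the 128-second figure in context, the wait schedule this job produces can be worked out directly from wait = 2**(count-1) and the count > 15 cutoff. The snippet below is a plain-Ruby sketch that needs no Rails; MAX_COUNT and the other variable names are illustrative and do not come from the app.

```ruby
# Waits scheduled by the job: one per retry, for count = 1..15.
MAX_COUNT = 15 # taken from the `count > 15` check in the job

waits = (1..MAX_COUNT).map { |count| 2**(count - 1) }

# Which retry produced the 128-second wait seen in the logs?
count_at_128 = waits.index(128) + 1

# The wait the missing check would have scheduled if the print was still unfinished.
next_wait = waits[count_at_128]

puts "128 s wait is retry number #{count_at_128}"
puts "next wait would be #{next_wait} s"
puts "longest wait: #{waits.last} s, total of all waits: #{waits.sum} s"
```

So the check that went missing is the one enqueued with count = 8. Had it run and found the print unfinished, it would have logged a 256-second wait.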
Active Job is a framework for declaring jobs and making them run on a variety of queueing backends. These jobs can be everything from regularly scheduled clean-ups, to billing charges, to mailings. Anything that can be chopped up into small units of work and run in parallel, really.
The simplest way to check whether delayed_job is running is to look at the locked_by field, which holds the worker or process currently locking/processing the job. Running Delayed::Job.where('locked_by is not null') will return results if there are jobs running.
In the background, the server can process all background jobs one by one. Rails provides Active Job to process background jobs and run them on a variety of queueing backends.
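When the concern is a scheduled job that never ran, locked_by alone may not tell the whole story: a delayed retry sits in the table unlocked until its run_at passes. As a rough sketch, the helper below classifies a row using the standard delayed_jobs columns locked_by, run_at and failed_at. JobRow and job_state are hypothetical names, used only so the snippet runs without a database; in a Rails console the same method could be handed real Delayed::Job records.

```ruby
# Stand-in for a row of the delayed_jobs table (hypothetical struct;
# the field names mirror the real columns).
JobRow = Struct.new(:locked_by, :run_at, :failed_at, keyword_init: true)

# Classify a job by the columns Delayed Job maintains.
def job_state(job, now = Time.now)
  return :failed    if job.failed_at     # permanently failed
  return :running   if job.locked_by     # a worker holds the lock
  return :scheduled if job.run_at > now  # still waiting for its run_at
  :ready                                 # due, but no worker has picked it up
end

now = Time.now
rows = [
  JobRow.new(locked_by: "host:web1 pid:123", run_at: now - 5),
  JobRow.new(run_at: now + 128),
  JobRow.new(run_at: now - 60),
  JobRow.new(run_at: now - 60, failed_at: now - 30)
]

rows.each { |row| puts job_state(row, now) }
```

If a retry shows up as ready long after its run_at, that would point at the workers rather than at the job code.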
Correct, there is no flaw in the stated logic that would prevent either a correctly collected status check or another attempt with a wait.
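One way to check this without Rails is to replay the branch structure of perform against a stub. The sketch below is a simplification under stated assumptions: StubPrint and simulate are hypothetical names, re-enqueueing is modelled as a loop rather than as real delayed jobs, and the puts banners are left out.

```ruby
# Hypothetical stand-in for CloudPrint: finishes after a set number of checks.
class StubPrint
  attr_reader :failed

  def initialize(checks_needed)
    @checks_needed = checks_needed
    @checks = 0
    @failed = false
  end

  def update_status
    @checks += 1
  end

  def finished?
    @checks >= @checks_needed
  end

  def mark_as_failed
    @failed = true
  end
end

# Replays the job's branches; returns the outcome and the waits it scheduled.
def simulate(cloud_print)
  count = 0
  waits = []
  loop do
    cloud_print.update_status
    return [:finished, waits] if cloud_print.finished?

    count += 1
    wait = 2**(count - 1)
    if count > 15
      cloud_print.mark_as_failed
      return [:failed, waits]
    end
    waits << wait # stands in for set(wait: ...).perform_later(cloud_print, count)
  end
end

outcome, waits = simulate(StubPrint.new(10))
puts "#{outcome} after waits #{waits.inspect}"

outcome, waits = simulate(StubPrint.new(Float::INFINITY))
puts "#{outcome} after #{waits.size} retries"
```

Every path either returns finished, schedules another wait, or marks the print as failed, which suggests the cause of the disappearance lies outside this method.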
I've verified that your job code behaves correctly beyond a 128-second wait with the following setup:
- A rails new project
- delayed_job_active_record added to the Gemfile
- bundle install, rails generate delayed_job:active_record and rake db:migrate run to install gems and create the Delayed Job DB table
- config.active_job.queue_adapter = :delayed_job set in config/application.rb
- A CloudPrint < ApplicationRecord model with update_status, finished? and mark_as_failed methods in app/models/cloud_print.rb
- The job code from the question in app/jobs/check_cloud_print_status_job.rb
- CheckCloudPrintStatusJob.perform_later(CloudPrint.create) run via the Rails console (bin/rails c)

Since the above sequence behaved correctly without any issue, you need to expand your search by providing a more complete and verifiable example that actually reproduces the problem. Either upload your entire Rails project to a GitHub repo once you've been able to reproduce the issue consistently, or investigate other aspects of your environment and project configuration. Here are some possibilities:
- The queued jobs may have been cleared from the queue (for example, with rake jobs:clear)
- finished? could have returned true after update_status was invoked, so no final status message was printed even though the processing finished successfully.

N.B. Delayed Job supports retrying failed jobs with a delay of 5 seconds + N ** 4, where N is the number of attempts, so there's no need to re-implement this logic yourself. Just raise an exception if cloud_print.finished? is false, and you shouldn't need any other custom delay code:
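class CheckCloudPrintStatusJob < ApplicationJob
  queue_as :default

  def perform(cloud_print)
    # Refresh the status first, as the original job does
    cloud_print.update_status
    # Raising makes Delayed Job reschedule the job with its built-in backoff
    raise 'Not ready' unless cloud_print.finished?
  end
end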
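For a feel of how the built-in backoff compares with the hand-rolled doubling, the delays from the 5 + N ** 4 formula can be tabulated in plain Ruby. This is only the arithmetic from the formula quoted above, not a call into Delayed Job itself; retry_delay is a hypothetical helper name.

```ruby
# Delay in seconds before the next try, after `attempts` failed attempts,
# per the 5 + N ** 4 formula quoted above.
def retry_delay(attempts)
  5 + attempts**4
end

delays = (1..8).map { |n| retry_delay(n) }
puts delays.inspect

# Cumulative time elapsed before each retry runs.
elapsed = 0
delays.each_with_index do |delay, i|
  elapsed += delay
  puts "retry #{i + 1}: +#{delay} s (about #{(elapsed / 60.0).round(1)} min in)"
end
```

By the eighth attempt the delay is a little over an hour (4101 seconds). How many attempts are made before Delayed Job gives up is governed by Delayed::Worker.max_attempts, so that setting is worth checking if the print service can take a long time.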