
How to mark a sidekiq task/job for retry without raising an error?

I use a Sidekiq queue to process communications with an unreliable third-party API. Since this API is often down for a couple of minutes at a time and then back up again, Sidekiq has been handy. When a connection issue happens, an error is raised and Sidekiq puts the job back in the queue to be retried after some time has passed.

I use NewRelic not only to help debug crashes but also for monitoring. My problem is that this approach creates errors in NewRelic. If the third-party API is down for more than a couple of minutes, the error count accumulates enough to trigger NewRelic notifications.

What I'd like to do is raise an error from my worker only after a job has been retried a certain number of times. I'm using sidekiq_retries_exhausted to do this. What I'm not sure about is how to put a failed job back in the queue without raising an error.

Does Sidekiq provide any facilities to return a job to a queue, increment the number of retries for the job, and have it sit there until it's due to run again, as if an exception was raised in the worker class?

asked Feb 29 '16 23:02 by YWCA Hello



3 Answers

Raise a specific error class and tell your error service to ignore errors of that type. For NewRelic:

https://docs.newrelic.com/docs/agents/ruby-agent/installation-configuration/ruby-agent-configuration#error_collector.ignore_errors
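The same idea can also be sketched in Ruby rather than in newrelic.yml. The New Relic Ruby agent offers NewRelic::Agent.ignore_error_filter, whose block returns nil for errors that should not be recorded. The names ApiTemporarilyDown and IGNORE_RETRY_SIGNAL below are made up for illustration, and the registration is guarded so the snippet also loads where the agent is not installed. Treat it as a starting point under those assumptions, not as the recommended configuration.

```ruby
# Hypothetical error class: raised only to ask Sidekiq for a retry.
class ApiTemporarilyDown < RuntimeError; end

# Returns nil for the retry signal (meaning "do not record it")
# and the error itself for everything else.
IGNORE_RETRY_SIGNAL = lambda do |error|
  error.is_a?(ApiTemporarilyDown) ? nil : error
end

# Register the filter only when the New Relic agent is loaded.
if defined?(NewRelic::Agent)
  NewRelic::Agent.ignore_error_filter(&IGNORE_RETRY_SIGNAL)
end
```

A worker would then raise ApiTemporarilyDown when the API is unreachable, and Sidekiq would retry as usual while the error stays out of the NewRelic error count.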

answered Sep 24 '22 19:09 by Mike Perham


Here is what I did to keep intentional retry errors out of AirBrake:

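class TaskWorker
  include Sidekiq::Worker

  # Raised only to make Sidekiq retry the job; not a real failure.
  class RetryNotAnError < RuntimeError
  end

  def perform(task_id)
    task = Task.find(task_id)
    task.do_cool_stuff

    if task.finished?
      # `logger` is provided by Sidekiq::Worker.
      logger.debug "Task #{task_id} was successful."
    else
      logger.debug "Task #{task_id} will try again later."
      # Raising sends the job back through Sidekiq's normal retry schedule.
      raise RetryNotAnError, "Task #{task_id} is not finished yet"
    end
  end
end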

Tell Airbrake to ignore it:

Airbrake.configure do |config|
  # The class is nested inside TaskWorker, so use its fully qualified name.
  config.ignore << 'TaskWorker::RetryNotAnError'
end
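Newer releases of the airbrake gem appear to have replaced config.ignore with notice filters, so on those versions the equivalent would look roughly like the sketch below. It assumes Airbrake.add_filter and notice.ignore! behave as described in the airbrake-ruby README. RETRY_ERROR_TYPE and RETRY_NOTICE are illustrative names, and the guard keeps the file loadable without the gem.

```ruby
# Fully qualified name of the retry-signal class from the worker above.
RETRY_ERROR_TYPE = "TaskWorker::RetryNotAnError"

# True when any error recorded on the notice is the retry signal.
# `notice` only needs to respond to [] with an :errors list of hashes.
RETRY_NOTICE = lambda do |notice|
  notice[:errors].any? { |error| error[:type] == RETRY_ERROR_TYPE }
end

# Guarded so the file still loads without the airbrake gem.
if defined?(Airbrake) && Airbrake.respond_to?(:add_filter)
  Airbrake.add_filter do |notice|
    notice.ignore! if RETRY_NOTICE.call(notice)
  end
end
```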

It's a good idea to give the exception a name that is OBVIOUSLY not an error (e.g. RetryLaterNotAnError), since it will still show up in logs and such, and you don't want to alarm people when they see a bunch of them.

P.S. That said, I would really like to see Sidekiq provide an explicit, errorless retry mechanism.
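For the original goal of reporting only once a job has run out of retries, one possible arrangement pairs an ignored retry error with Sidekiq's sidekiq_retries_exhausted hook. This is a sketch under assumptions: ApiSyncWorker, ApiStillDownError, exhausted_error and ThirdPartyApi are hypothetical names, the retry count of 5 is arbitrary, and the worker is only defined when Sidekiq is loaded.

```ruby
require "timeout"

# Hypothetical error classes: the first is the ignorable retry signal,
# the second is the one "real" error reported after Sidekiq gives up.
class RetryNotAnError < RuntimeError; end
class ApiStillDownError < StandardError; end

# Builds the error to report from the job hash Sidekiq hands to the
# retries-exhausted hook and the last exception the job raised.
def exhausted_error(job, exception)
  ApiStillDownError.new(
    "#{job['class']} #{job['args'].inspect} gave up: #{exception.message}"
  )
end

if defined?(Sidekiq::Worker)
  class ApiSyncWorker
    include Sidekiq::Worker
    sidekiq_options retry: 5 # assumption: report after 5 failed retries

    # Runs once, after the final retry has failed.
    sidekiq_retries_exhausted do |job, exception|
      error = exhausted_error(job, exception)
      Airbrake.notify(error) if defined?(Airbrake)
    end

    def perform(record_id)
      ThirdPartyApi.sync(record_id) # hypothetical API client call
    rescue Timeout::Error => e
      # Ignored by the error service, but still triggers a Sidekiq retry.
      raise RetryNotAnError, e.message
    end
  end
end
```

With this layout the error service sees nothing while the API flaps, and receives a single ApiStillDownError only if the outage outlasts the whole retry schedule.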

answered Sep 20 '22 19:09 by David Hempy


If you are using Sidekiq Enterprise, another option is to register additional error types that are then treated like Sidekiq::Limiter::OverLimit violations, so the job is rescheduled automatically instead of being recorded as a failure.

For my purposes, I've used a new error class and then added it to the list in the config. Here are the notes from the sidekiq-ent code (not in the public sidekiq repo) on how to modify your config file:

    # An optional set of additional error types which would be
    # treated as a rate limit violation, so the job would automatically
    # be rescheduled as with Sidekiq::Limiter::OverLimit.
    #
    # Sidekiq::Limiter.errors << MyApp::TooMuch
    # Sidekiq::Limiter.errors = [Foo::Error, MyApp::Limited]

Inside a specific job you can set the maximum number of limiter retries; otherwise it defaults to 20:

sidekiq_options max_limiter_retries: 10

Inside the job, I'll rescue the "expected" intermittent error that I'd rather not ignore completely and then raise the error I've added to the list, something like this:

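def perform(*args)
  # ... call the third-party API here ...
rescue RestClient::RequestTimeout => e
  # Re-raise as the soft-retry error registered with Sidekiq::Limiter.
  raise SidekiqSoftRetry, e.inspect
end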

Here's what that looks like in my initialization file. Mike Perham was kind enough to respond with the option to update the global retry limit.

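# Raised for expected, intermittent failures that should be rescheduled quietly.
class SidekiqSoftRetry < RuntimeError
end

# Global default for the number of reschedules.
Sidekiq::Limiter::DEFAULT_OPTIONS[:reschedule] = 10

Sidekiq::Limiter.configure do |config|
  config.errors.concat([SidekiqSoftRetry])
end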

answered Sep 24 '22 19:09 by Scott Gratton