
Dealing with exception handling and re-queueing in RQ on Heroku

I have a Python website running on Heroku, with a worker running as a background process to handle tasks that would otherwise block page delivery and are therefore inappropriate for the web dynos. For this, I've set up a queue using rq and Redis.

In my process, occasionally, custom exceptions might arise. For a specific subset of these, rather than allow the job to go straight to the 'failed' queue, I want to requeue it a few times. I've been looking at the exception handlers page on the rq homepage, and I'm unclear on a few things. In particular, it describes the following way to write an exception handler:

def my_handler(job, exc_type, exc_value, traceback):
    # do custom things here
    # for example, write the exception info to a DB
    ...

Right now, I'm planning to do something along the lines of:

from rq import requeue_job

def my_handler(job, exc_type, exc_value, traceback):
    if exec_type == "MyCustomError":
        job.meta['MyErrorCount'] += 1
        job.save()

        if job.meta['MyErrorCount'] >= 10:
            return True
        else:
            requeue_job(job.id)
            return False

Questions:

  • What kinds of objects are exc_type, exc_value, and traceback? (e.g., is the line if exec_type == "MyCustomError" at all correct?)
  • Will my error handler effectively detect if it's a specific error, requeue those jobs until it fails 10 times, and then let it fall to failed? Will it also let all other errors fall to failed?
jdotjdot asked Oct 08 '12 00:10



1 Answer

Here’s my solution:

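from rq import Queue
from rq.job import Status  # assumes an older rq release that exposes Status here

# CorruptImageError is the application's own exception class, defined elsewhere.

queues = []

def retry_handler(job, exc_type, exc_value, traceback):
    # Returning True moves the job to the failed queue (or continues to
    # the next handler).

    # The counter starts at 1 and is incremented on every failure, so the
    # check below trips on the job's third failure.
    job.meta.setdefault('failures', 1)
    job.meta['failures'] += 1

    # exc_type is a class, so test it with issubclass(), not isinstance().
    if job.meta['failures'] > 3 or issubclass(exc_type, (LookupError, CorruptImageError)):
        job.save()
        return True

    # Put the job back on the queue it originally came from.
    job.status = Status.QUEUED
    for queue_ in queues:
        if queue_.name == job.origin:
            queue_.enqueue_job(job, timeout=job.timeout)
            break
    else:
        return True  # Queue has disappeared, fail job

    return False  # Job is handled. Stop the handler chain.

queues.append(Queue(exc_handler=retry_handler))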

I decided to retry on every error, giving each job up to three attempts, unless a known exception type was encountered. This lets me respect failures that are understood and will never succeed on retry: for example, a user who was deleted after the job was created but before it ran, or an image resize job whose image can no longer be found (HTTP 404) or is not in a readable format. In short, I fail immediately whenever I know the code will never be able to handle the job.
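The retry decision itself can be pulled out of the handler and tested without rq or Redis. The following is a minimal sketch of that idea, not the code used above: `should_give_up`, `MAX_FAILURES`, and `PERMANENT_ERRORS` are hypothetical names, the `CorruptImageError` class is a stand-in for the application's own exception, and the counter here simply counts failures from zero.

```python
class CorruptImageError(Exception):
    """Hypothetical stand-in for the application's own exception class."""


# Assumed policy values, chosen for illustration only.
MAX_FAILURES = 3
PERMANENT_ERRORS = (LookupError, CorruptImageError)


def should_give_up(meta, exc_type,
                   max_failures=MAX_FAILURES, permanent=PERMANENT_ERRORS):
    """Record one failure in `meta`; return True if the job should not be retried."""
    meta['failures'] = meta.get('failures', 0) + 1
    # Known, unrecoverable errors fail immediately, whatever the count.
    if issubclass(exc_type, permanent):
        return True
    # Everything else is retried until the failure limit is reached.
    return meta['failures'] >= max_failures


# A plain dict plays the role of job.meta in this sketch.
meta = {}
outcomes = [should_give_up(meta, ValueError) for _ in range(3)]
print(outcomes)                      # [False, False, True]
print(should_give_up({}, KeyError))  # True: KeyError is a LookupError
```

A handler could call such a function with `job.meta` and return its result when it is True. How the handler is registered depends on the rq version; the rq documentation registers handlers on the worker (for example `Worker([queue], exception_handlers=[handler])` in recent releases), so check the release you run.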

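To answer your question: exc_type is the exception class and exc_value is the exception instance, so comparing exc_type to the string "MyCustomError" will never match; use issubclass(exc_type, MyCustomError) or isinstance(exc_value, MyCustomError) instead. traceback is the traceback object, which is useful for logging. If you care about logging, check out Sentry. Workers are automatically configured with a Sentry error handler if run with SENTRY_DSN in the context, which is much cleaner than polluting your own DB with error logs.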
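To see what the three arguments look like, here is a small standard-library sketch. It assumes the handler receives the same triple that `sys.exc_info()` returns, as described above; `MyCustomError` and `describe` are illustrative names, and no rq code is involved.

```python
import sys


class MyCustomError(Exception):
    """Stand-in for the custom exception from the question."""


def describe(exc_type, exc_value, traceback):
    """Report a few facts about an exception triple."""
    return {
        'type_is_class': isinstance(exc_type, type),
        # Comparing the class to a string, as in the question, is always False.
        'equals_name_string': exc_type == "MyCustomError",
        'is_subclass': issubclass(exc_type, MyCustomError),
        'value_is_instance': isinstance(exc_value, MyCustomError),
        'message': str(exc_value),
        'has_traceback': traceback is not None,
    }


try:
    raise MyCustomError("boom")
except MyCustomError:
    info = describe(*sys.exc_info())

print(info['equals_name_string'])  # False
print(info['is_subclass'])         # True
```

Checking with `issubclass` (or `isinstance` on `exc_value`) also matches subclasses of the custom exception, which a comparison on the class name would miss.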

Jökull answered Oct 24 '22 06:10
