 

Running "unique" tasks with celery

I use celery to update RSS feeds in my news aggregation site. I use one @task for each feed, and things seem to work nicely.

There's a detail that I'm not sure I handle well, though: all feeds are updated once every minute with a @periodic_task, but what if a feed is still updating from the last periodic task when a new one starts? (For example, if the feed is really slow, or offline and the task is held in a retry loop.)

Currently I store tasks results and check their status like this:

```python
import socket
from datetime import timedelta
from celery.decorators import task, periodic_task
from aggregator.models import Feed

_results = {}

@periodic_task(run_every=timedelta(minutes=1))
def fetch_articles():
    for feed in Feed.objects.all():
        if feed.pk in _results:
            if not _results[feed.pk].ready():
                # The task is not finished yet
                continue
        _results[feed.pk] = update_feed.delay(feed)

@task()
def update_feed(feed):
    try:
        feed.fetch_articles()
    except socket.error, exc:
        update_feed.retry(args=[feed], exc=exc)
```
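The scheduling rule in the code above ("launch a new update only if the previous one has finished") can be sketched in a framework-free way. Here `launch` and `is_ready` are illustrative stand-ins for `update_feed.delay` and `AsyncResult.ready()`:

```python
# Minimal sketch of the "skip if still running" check.
# `launch` and `is_ready` are hypothetical callables standing in for
# celery's delay() and AsyncResult.ready().
_pending = {}  # feed_pk -> handle of the last launched task

def maybe_launch(feed_pk, launch, is_ready):
    handle = _pending.get(feed_pk)
    if handle is not None and not is_ready(handle):
        return False  # previous update still running; skip this round
    _pending[feed_pk] = launch(feed_pk)
    return True
```

Note that a module-level dict like `_pending` (or `_results` above) lives in a single process's memory, so it does not survive restarts and is not shared between workers, which is one reason to prefer a cache-based lock as suggested in the answers below.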

Maybe there is a more sophisticated/robust way of achieving the same result using some celery mechanism that I missed?

Luper Rouch asked Nov 04 '10


People also ask

Is celery task ID unique?

Answer: Yes, but make sure it's unique, as the behavior for two tasks existing with the same id is undefined.
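Since duplicate task ids lead to undefined behaviour, a common way to guarantee uniqueness when supplying your own id (celery's `apply_async` accepts a `task_id` argument) is to build it from a UUID. A small sketch, with the prefix name being purely illustrative:

```python
import uuid

def make_task_id(prefix):
    # uuid4 is collision-resistant, so two generated ids will not clash
    # even across processes.
    return "%s-%s" % (prefix, uuid.uuid4())

# e.g. update_feed.apply_async(args=[feed], task_id=make_task_id("update-feed"))
```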

Can a celery task call another task?

To answer your opening questions: As of version 2.0, Celery provides an easy way to start tasks from other tasks. What you are calling "secondary tasks" are what it calls "subtasks".


2 Answers

Based on MattH's answer, you could use a decorator like this:

```python
import functools
from django.core.cache import cache

def single_instance_task(timeout):
    def task_exc(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            lock_id = "celery-single-instance-" + func.__name__
            acquire_lock = lambda: cache.add(lock_id, "true", timeout)
            release_lock = lambda: cache.delete(lock_id)
            if acquire_lock():
                try:
                    func(*args, **kwargs)
                finally:
                    release_lock()
        return wrapper
    return task_exc
```

then, use it like so...

```python
@periodic_task(run_every=timedelta(minutes=1))
@single_instance_task(60*10)
def fetch_articles():
    yada yada...
```
SteveJ answered Sep 21 '22


From the official documentation: Ensuring a task is only executed one at a time.
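The key idea in the documented pattern is that `cache.add` is an atomic "set only if absent", which makes it usable as a lock. A minimal, testable sketch of that idea, where a plain dict stands in for the Django cache (an assumption for demonstration only; a real deployment would use memcached or a similar shared backend, since a dict is per-process):

```python
# Fake cache standing in for django.core.cache (illustrative only).
_fake_cache = {}

def cache_add(key, value, timeout=None):
    # Mimics cache.add: succeeds only if the key is not already set.
    if key in _fake_cache:
        return False
    _fake_cache[key] = value
    return True

def cache_delete(key):
    _fake_cache.pop(key, None)

def run_single_instance(name, func):
    lock_id = "celery-single-instance-" + name
    if not cache_add(lock_id, "true"):
        return None  # another instance holds the lock; skip this run
    try:
        return func()
    finally:
        cache_delete(lock_id)
```

With a real shared cache, the `timeout` argument matters: it bounds how long a lock can be held if a worker dies before releasing it.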

MattH answered Sep 24 '22