I have converted a standalone batch job to use celery for dispatching the work to be done. I'm using RabbitMQ. Everything is running on a single machine and no other processes are using the RabbitMQ instance. My script just creates a bunch of tasks which are processed by workers.
Is there a simple way to measure the time from the start of my script until all tasks are finished? I know that this is a bit complicated by design when using message queues. But I don't want to do it in production, just for testing and getting a performance estimate.
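Roughly, the dispatch side looks like this (the app, broker URL, and task are just placeholders for my real code):

from time import time
from celery import Celery

app = Celery('jobs', broker='amqp://localhost')  # local RabbitMQ

@app.task
def process_item(item):
    ...  # the real work

start = time()
async_results = [process_item.delay(i) for i in range(1000)]
# What I'd like is to know when the last of these has actually finished,
# so I can do something like print(time() - start).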
You could use celery signals: functions registered on task_prerun and task_postrun are called before and after each task is executed, so it is straightforward to measure the elapsed time per task:
from time import time
from celery.signals import task_prerun, task_postrun

# task_id -> time the task started running on the worker
d = {}

@task_prerun.connect
def task_prerun_handler(signal, sender, task_id, task, args, kwargs, **extras):
    d[task_id] = time()

@task_postrun.connect
def task_postrun_handler(signal, sender, task_id, task, args, kwargs, retval, state, **extras):
    try:
        cost = time() - d.pop(task_id)
    except KeyError:
        cost = -1
    print(task.__name__, cost)
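Since the question is about the total wall-clock time for the whole batch rather than per-task time, here is a rough (hypothetical) extension of the same signals that tracks the earliest start and latest finish seen by the worker; it is only a sketch, and with several worker processes each process keeps its own counters:

from time import time
from celery.signals import task_prerun, task_postrun

# Aggregate timing across all tasks handled by this worker process.
batch = {'first_start': None, 'last_finish': None, 'count': 0}

@task_prerun.connect
def mark_start(**extras):
    if batch['first_start'] is None:
        batch['first_start'] = time()

@task_postrun.connect
def mark_finish(**extras):
    batch['last_finish'] = time()
    batch['count'] += 1
    print('%d tasks done, %.2fs since the first task started'
          % (batch['count'], batch['last_finish'] - batch['first_start']))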
You could also use a chord: add a fake task at the end that is passed the time at which the tasks were sent, and that, when executed, prints the difference between the current time and the time it was given.
import celery
import datetime
from celery import chord

@celery.task
def dummy_task(res=None, start_time=None):
    # res receives the results of the tasks in the chord header
    print(datetime.datetime.now() - start_time)

def send_my_task():
    # my_task is the task you want to profile
    chord([my_task.s()], dummy_task.s(start_time=datetime.datetime.now())).delay()
send_my_task sends the task that you want to profile along with a dummy_task that prints how long it took (more or less). If you want more accurate numbers, I suggest passing the start_time directly to your tasks and using the signals.
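A minimal sketch of that more accurate variant, assuming the caller adds a start_time keyword argument when dispatching (names here are illustrative):

from time import time
from celery.signals import task_postrun

# The postrun handler reads the caller-supplied timestamp from the task's
# kwargs and reports the end-to-end latency (queue time + run time).
@task_postrun.connect
def report_total(task=None, kwargs=None, **extras):
    start_time = (kwargs or {}).get('start_time')
    if start_time is not None:
        print(task.__name__, 'total:', time() - start_time)

# Dispatch side:
# my_task.delay(some_arg, start_time=time())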