At first glance I very much liked the "Batches" feature in Celery, because I need to group a number of IDs before calling an API (otherwise I may get kicked out).
Unfortunately, after a bit of testing, batch tasks don't seem to play well with the rest of the Canvas primitives, in this case chains. For example:
from celery.contrib.batches import Batches

# "a" is the Celery app instance (connected to the Redis broker shown in the log below)

@a.task(base=Batches, flush_every=10, flush_interval=5)
def get_price(requests):
    for request in requests:
        a.backend.mark_as_done(request.id, 42, request=request)
    print("filter_by_price " + str([r.args[0] for r in requests]))

@a.task
def completed():
    print("complete")
So, with this simple workflow:
chain(get_price.s("ID_1"), completed.si()).delay()
I see this output:
[2015-07-11 16:16:20,348: INFO/MainProcess] Connected to redis://localhost:6379/0
[2015-07-11 16:16:20,376: INFO/MainProcess] mingle: searching for neighbors
[2015-07-11 16:16:21,406: INFO/MainProcess] mingle: all alone
[2015-07-11 16:16:21,449: WARNING/MainProcess] celery@ultra ready.
[2015-07-11 16:16:34,093: WARNING/Worker-4] filter_by_price ['ID_1']
After 5 seconds, get_price() is triggered just as expected (the "filter_by_price ['ID_1']" line above). The problem is that completed() never gets invoked.
Any idea what could be going on here? If batches are not the way to go, what would be a decent approach to this problem?
PS: I have set CELERYD_PREFETCH_MULTIPLIER=0, as the docs say.
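For reference, that setting can also be applied on the app object itself; this is the Celery 3.x-era name, newer releases spell it worker_prefetch_multiplier:

a.conf.CELERYD_PREFETCH_MULTIPLIER = 0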
If you look at the Celery docs on tasks, you see that to call a task synchronously you use the apply() method as opposed to the apply_async() method. The docs also note that if the CELERY_ALWAYS_EAGER setting is set, apply_async() is replaced by a local apply() call instead.
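A minimal sketch of the difference, reusing the completed task and the a = Celery(...) app from the question:

# Runs the task body right here, synchronously, without a worker:
completed.apply()

# Sends a message to the broker; a worker picks it up later:
completed.apply_async()

# With this setting (e.g. in a test configuration), apply_async()/delay()
# calls are executed eagerly as local apply() calls instead:
a.conf.CELERY_ALWAYS_EAGER = True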
Celery supports two thread-based execution pools: eventlet and gevent. With these pools, the execution pool runs in the same process as the Celery worker itself; to be precise, both eventlet and gevent use greenlets, not threads.
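A sketch of selecting one of these pools through configuration on the question's app (both libraries need to be pip-installed; the usual alternative is the command line, e.g. celery -A <module> worker -P eventlet):

a.conf.CELERYD_POOL = 'eventlet'   # or 'gevent'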
Celery can talk to many different brokers (message transports) and store results in many different backends (result stores).
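For instance, an app wired to the same Redis instance that appears in the log above, used as both broker and result backend, might look like this (the module name tasks is just an illustration):

from celery import Celery

a = Celery('tasks',
           broker='redis://localhost:6379/0',
           backend='redis://localhost:6379/0')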
Another way of chaining tasks is the pipe operator (" | "): it builds exactly the same chain as chain(...), so you can use whichever spelling you prefer. The Celery documentation also says that groups are used to execute tasks in parallel.
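Using the tasks from the question, the two chain spellings and a parallel group look like this (the extra IDs are made up for illustration):

from celery import chain, group

chain(get_price.s("ID_1"), completed.si()).delay()   # explicit chain(...)
(get_price.s("ID_1") | completed.si()).delay()       # same chain, pipe syntax

group(get_price.s(i) for i in ["ID_1", "ID_2", "ID_3"]).delay()  # runs in parallel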
This applies to Celery tasks as well. If you don't wrap your tasks with transaction.atomic(), or use it inside your task body, you may have data integrity problems. It's worth auditing your tasks to find where you should use transaction.atomic(). You could even add a project-specific wrapper for Celery's @shared_task that adds @atomic to your tasks.
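A sketch of such a wrapper, assuming a configured Django project (update_order is a hypothetical task):

from celery import shared_task
from django.db import transaction

def atomic_shared_task(**task_options):
    """shared_task variant that runs the whole task body in one transaction."""
    def decorator(func):
        return shared_task(**task_options)(transaction.atomic(func))
    return decorator

@atomic_shared_task()
def update_order(order_id):
    # All ORM writes in here either commit together or roll back together.
    ...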
Here are some issues I've seen crop up several times in Django projects using Celery; they probably apply to other task queues too, I simply haven't used them as much. One is enqueueing data rather than references: if you duplicate data from your database in your task arguments, it can go stale in the queue before the task executes. This is not so easy to do accidentally in Celery since version 4, which changed the default serializer from pickle to JSON (if you're not sure which serializer you're using, check your settings), but it is still possible to enqueue data rather than references.
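A sketch of the difference, using Django's built-in User model as an example (the task names are made up):

from celery import shared_task

# Fragile: the dict was copied from the database at enqueue time and may be
# stale by the time the task runs.
@shared_task
def deactivate_user_from_data(user_data):
    print("deactivating", user_data["email"])

# Safer: enqueue only the primary key and re-fetch fresh data inside the task.
@shared_task
def deactivate_user(user_id):
    from django.contrib.auth.models import User
    user = User.objects.get(pk=user_id)
    user.is_active = False
    user.save()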
It looks like the behaviour of batch tasks is significantly different from that of normal tasks; batch tasks are not even emitting signals like task_success.
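One way to see this for yourself is to attach a task_success handler and note that it never fires when the Batches task flushes (a small diagnostic sketch):

from celery.signals import task_success

@task_success.connect
def log_success(sender=None, result=None, **kwargs):
    print("task_success fired for %r with result %r" % (sender, result))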
Since you need to call the completed task after get_price, you can call it directly from get_price itself:
@a.task(base=Batches, flush_every=10, flush_interval=5)
def get_price(requests):
    for request in requests:
        # do the per-request work, e.g. mark each result as done as above
        a.backend.mark_as_done(request.id, 42, request=request)
    completed.delay()