I'm using the Perl client of beanstalkd. I need a simple way to avoid enqueueing the same work twice.
I need something that basically waits until there are K elements and then groups them together. To accomplish this, I have the producer:
insert item(s) into DB
insert a queue item into beanstalkd
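For example, a minimal producer sketch, assuming the Beanstalk::Client CPAN module, an already-connected DBI handle $dbh, and a made-up items table:

use Beanstalk::Client;

my $client  = Beanstalk::Client->new({ server => 'localhost:11300' });
my $payload = 'example work item';

$dbh->do('INSERT INTO items (payload) VALUES (?)', undef, $payload);  # step 1: store the item
$client->put({ data => $payload });                                   # step 2: enqueue one job per insert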
And the consumer (sketched with the same Beanstalk::Client module; count_db_items is a placeholder for however you count unprocessed rows):

use Beanstalk::Client;

my $client = Beanstalk::Client->new({ server => 'localhost:11300' });
my $K = 10;  # the batch size K from above
while (1) {
    my $job = $client->reserve;  # blocks until a job is ready
    if (count_db_items() >= $K) {
        func_to_process_all_items();
    }
    $job->delete;  # "kill job": remove it from the queue
}
This is linear in the number of requests, but consider the case of:
insert 1 item
... repeat many times ...
insert 1 item
Assuming all these insertions happened before any job was retrieved, this would add N queue items, and the consumer would do something like this:
check DB, process N items
check DB, no items
... many times ...
check DB, no items
Is there a smarter way to do this, so that the later job requests are not inserted or processed unnecessarily?
I had a related requirement. I only wanted to process a specific job once within a few minutes, but the producer could queue several instances of the same job. I used memcache to store the job identifier and set the expiry of the key to just a few minutes.
When a worker tried to add the job identifier to memcache, only the first would succeed; on failure to add the job id, the worker would delete the job without processing it. After a few minutes, the key expires from memcache and the job can be processed again.
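A minimal sketch of that check, assuming Cache::Memcached, a job payload that doubles as the identifier, and a five-minute window (the lock: key prefix and process_job are made up):

use Cache::Memcached;

my $memd = Cache::Memcached->new({ servers => ['127.0.0.1:11211'] });

# $job is a reserved Beanstalk::Job whose payload is the job identifier
my $job_id = $job->data;
if ($memd->add("lock:$job_id", 1, 300)) {  # add() only succeeds if the key is absent
    process_job($job_id);                  # first worker wins and does the work
}
$job->delete;                              # duplicates get deleted without processing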
Not particularly elegant, but it works.