In a Node.js app I'm using the kue queueing library, which is backed by Redis. When a job is complete I remove it from the queue. After running about 70,000 jobs overnight, the Redis memory usage is at approx 30MB. There were 18 failed jobs still in the database, and the queue length is currently zero - jobs are processed more quickly than they are queued. Redis is not being used in any other way.
Any ideas why the Redis memory usage keeps increasing even though I'm deleting the completed jobs? CoffeeScript code:
gaemodel.update = (params) ->
  job = jobs.create "gaemodel-update", params
  job.attempts 2
  job.save()
  job.on "complete", ->
    job.remove (err) ->
      throw err if err
      console.log 'completed job #%d', job.id
When you have a memory consumption issue with a queuing system, and you are 100% positive that all the queued items have been removed from the store and are not sitting in an exception/error queue, then the most probable cause is that the queuing rate is much higher than the dequeuing rate.
Redis uses a general-purpose memory allocator (jemalloc, ptmalloc, tcmalloc, etc.). These allocators do not necessarily give memory back to the system. When some memory is freed, the allocator tends to keep it (to reuse it for a future allocation). This is especially true when many small objects are randomly allocated, which is typically the case with Redis.
The consequence is that a peak of memory consumption at a given point in time will cause Redis to accumulate memory and keep it. This memory is not lost; it will be reused when another peak of memory consumption occurs. But from the system's point of view, the memory is still assigned to Redis. For a queuing system, queuing items faster than you can dequeue them produces exactly this kind of peak.
My advice would be to instrument your application to fetch and log the queue length at regular time intervals to check the evolution of the number of items in the queue (and identify the peak value).
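For instance, here is a minimal monitoring sketch in plain JavaScript; it assumes the default kue key prefix ("q") and the node_redis client, and the 10-second interval is arbitrary:

// Log the length of the pending queue at regular intervals
var redis = require('redis'),
    client = redis.createClient();

setInterval(function() {
  // kue keeps unprocessed job ids in the q:jobs:inactive zset
  client.zcard('q:jobs:inactive', function(err, len) {
    if (err) return console.error(err);
    console.log('inactive jobs: %d', len);
  });
}, 10000);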
Updated:
I have tested a few things with kue to understand what it stores in Redis. The data structure is actually quite complex (a mix of strings, sets, zsets, and hashes). If you look into Redis, you will find the following:
q:job:nnn (hash, job definition and properties)
q:search:object:nnn (set, metaphone tokens associated to job nnn)
q:search:word:XXXXX (set, reverse index to support job full-text indexing)
q:jobs:inactive (zset, all the unprocessed jobs)
q:jobs:X:inactive (zset, all the unprocessed jobs of job type X)
q:jobs:active (zset, all the on-going jobs)
q:jobs:X:active (zset, all the on-going jobs of job type X)
q:jobs:complete (zset, all the completed jobs)
q:jobs:X:complete (zset, all the completed jobs of job type X)
q:jobs:failed (zset, all the failed jobs)
q:jobs:X:failed (zset, all the failed jobs of job type X)
q:jobs:delayed (zset, all the delayed jobs)
q:jobs:X:delayed (zset, all the delayed jobs of job type X)
q:job:types (set, all the job types)
q:jobs (zset, all the jobs)
q:stats:work-time (string, work time statistic)
q:ids (string, job id sequence)
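Since each q:job:nnn key is a plain hash, you can dump one job's fields from redis-cli to see exactly what kue persists per job (the job id here is just an example):

redis 127.0.0.1:6379> hgetall q:job:1

HGETALL returns every field/value pair stored in the hash.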
I don't know CoffeeScript at all, so I tried to reproduce the problem using plain old JavaScript:
var kue = require('kue'),
    jobs = kue.createQueue();

jobs.process('email', function(job, done) {
  console.log('Processing email ' + JSON.stringify(job));
  done();
});

function create_email(i) {
  var j = jobs.create('email', {
      title: 'This is email ' + i
    , to: 'didier'
    , template: 'Bla bla bla'
  });
  j.on('complete', function() {
    console.log('complete email job #%d', j.id);
    j.remove(function(err) {
      if (err) throw err;
      console.log('removed completed job #%d', j.id);
    });
  });
  j.save();
}

for (var i = 0; i < 5; ++i) {
  create_email(i);
}

kue.app.listen(8080);
I ran this code, checking what remained in Redis after processing:
redis 127.0.0.1:6379> keys *
1) "q:ids"
2) "q:jobs:complete"
3) "q:jobs:email:complete"
4) "q:stats:work-time"
5) "q:job:types"
redis 127.0.0.1:6379> zrange q:jobs:complete 0 -1
1) "1"
2) "2"
3) "3"
4) "4"
5) "5"
So it seems completed jobs are kept in q:jobs:complete and q:jobs:X:complete even though the jobs themselves have been deleted. I suggest you check the cardinality of these zsets in your own Redis instance.
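A quick way to check from redis-cli is ZCARD, which returns the number of members in a zset. With the test data above, the five supposedly deleted jobs are still counted:

redis 127.0.0.1:6379> zcard q:jobs:complete
(integer) 5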
My explanation is that these zsets are updated after the 'complete' event is emitted. So the jobs are correctly removed, but their ids are inserted into those zsets just afterwards.
A workaround is to avoid relying on per-job events, and instead use the per-queue events to remove the jobs. For instance, the following modifications can be made:
// added this
jobs.on('job complete', function(id) {
  console.log('Job complete ' + id);
  kue.Job.get(id, function(err, job) {
    if (err) return;
    job.remove(function(err) {
      if (err) throw err;
      console.log('removed completed job #%d', job.id);
    });
  });
});

// updated that
function create_email(i) {
  var j = jobs.create('email', {
      title: 'This is email ' + i
    , to: 'didier'
    , template: 'Bla bla bla'
  });
  j.save();
}
After fixing the program, the content in Redis is much better:
redis 127.0.0.1:6379> keys *
1) "q:stats:work-time"
2) "q:ids"
3) "q:job:types"
You can probably use a similar strategy from CoffeeScript.
Glad to see you fixed your problem. In any case, next time you have a memory problem with Redis, your first port of call should be the Redis INFO command. It will tell you valuable information such as:
used_memory:3223928
used_memory_human:3.07M
used_memory_rss:1916928
used_memory_peak:3512536
used_memory_peak_human:3.35M
used_memory_lua:37888
mem_fragmentation_ratio:0.59
Or
db0:keys=282,expires=27,avg_ttl=11335089640
Such output is very handy for understanding the state of your memory and keyspace at any given moment.
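If you would rather pull the same data from your Node application, here is a minimal sketch using the node_redis client (the reply is the raw INFO report as a single string; parse or log it as you see fit):

var redis = require('redis'),
    client = redis.createClient();

// INFO returns the full server report as one string
client.info(function(err, reply) {
  if (err) throw err;
  console.log(reply);
});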