
DelayQueue with higher speed remove()?

I have a project that keeps track of state information for over 500k objects. The program receives 10k updates/second about these objects; each update is a new, update or delete operation.

As part of the program, housekeeping must be performed on these objects roughly every five minutes. For this purpose I've placed them in a DelayQueue, with the objects implementing the Delayed interface, so that the blocking behaviour of the DelayQueue controls when housekeeping runs for each object.

  • Upon new, an object is placed on the DelayQueue.

  • Upon update, the object is remove()'d from the DelayQueue, updated and then reinserted at its new position, as dictated by the updated information.

  • Upon delete, the object is remove()'d from the DelayQueue.

The problem I'm facing is that the remove() method becomes a prohibitively long operation once the queue passes around 450k objects.

The program is multithreaded: one thread handles updates and another handles the housekeeping. Due to the remove() delay, we get nasty locking performance issues, and eventually the update thread's buffers consume all of the heap space.

I've managed to work around this by creating a DelayedWeakReference (extends WeakReference implements Delayed), which allows me to leave "shadow" objects in the queue until they would expire normally.
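A minimal sketch of what such a class might look like is given here. The question does not show the real implementation, so everything beyond "extends WeakReference implements Delayed" is an assumption: in particular, the `TrackedObject` class, its `currentRef` field and the `schedule`/`isCurrent` methods are hypothetical names used to show one way the housekeeping thread could tell a live entry from a shadow.

```java
import java.lang.ref.WeakReference;
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

/** Weak reference to a tracked object that also carries an expiry time. */
class DelayedWeakReference<T> extends WeakReference<T> implements Delayed {
    private final long expiryNanos; // deadline on the System.nanoTime() clock

    DelayedWeakReference(T referent, long delay, TimeUnit unit) {
        super(referent);
        this.expiryNanos = System.nanoTime() + unit.toNanos(delay);
    }

    @Override
    public long getDelay(TimeUnit unit) {
        return unit.convert(expiryNanos - System.nanoTime(), TimeUnit.NANOSECONDS);
    }

    @Override
    public int compareTo(Delayed other) {
        if (other instanceof DelayedWeakReference) {
            // Subtract before comparing so nanoTime() wrap-around is handled.
            return Long.signum(expiryNanos - ((DelayedWeakReference<?>) other).expiryNanos);
        }
        return Long.compare(getDelay(TimeUnit.NANOSECONDS),
                            other.getDelay(TimeUnit.NANOSECONDS));
    }
}

/** Hypothetical tracked object; only the queue bookkeeping is shown. */
class TrackedObject {
    // Assumption: the object remembers which queue entry is the live one.
    volatile DelayedWeakReference<TrackedObject> currentRef;

    /** On new or update: enqueue a fresh entry and leave the old one as a shadow. */
    void schedule(DelayQueue<DelayedWeakReference<TrackedObject>> queue,
                  long delay, TimeUnit unit) {
        DelayedWeakReference<TrackedObject> ref =
                new DelayedWeakReference<>(this, delay, unit);
        currentRef = ref;
        queue.add(ref); // no remove(): the previous entry simply expires later
    }

    /** True if the expired entry is the live one rather than a shadow. */
    boolean isCurrent(DelayedWeakReference<TrackedObject> ref) {
        return currentRef == ref;
    }
}
```

In this sketch the housekeeping thread would take() an entry, call get() on it, and skip the entry when the referent is null (the object was deleted and collected) or when isCurrent() returns false (the object was updated since the entry was queued).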

This takes the performance issue away, but causes a significant increase in memory requirements. Doing this results in around 5 DelayedWeakReferences for every object that actually needs to be in the queue.

Is anyone aware of a DelayQueue with additional tracking that permits fast remove() operations? Or does anyone have suggestions for better ways to handle this without consuming significantly more memory?

CuddlyDragon asked Nov 16 '12 16:11



2 Answers


It took me some time to think about this, but after reading your interesting question for some minutes, here is my idea:
If your objects have some sort of ID, hash it, and instead of a single delay queue use N delay queues, picking the queue from the hash.
This will reduce lock contention by a factor of N.
There will be a central data structure holding these N queues. Since N is preconfigured, you can create all N queues when the system starts.
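The idea could be sketched roughly as follows. This is only an illustration under assumptions: `ShardedDelayQueue`, `TimedEntry` and their methods are hypothetical names, and the ID is assumed to be a `long`. Each shard is an ordinary `java.util.concurrent.DelayQueue`, so `remove()` is still a linear scan, but over roughly 1/N of the elements and under that shard's own lock.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

/** Hypothetical queue element: an object ID plus an expiry time. */
class TimedEntry implements Delayed {
    final long id;
    private final long expiryNanos; // deadline on the System.nanoTime() clock

    TimedEntry(long id, long delay, TimeUnit unit) {
        this.id = id;
        this.expiryNanos = System.nanoTime() + unit.toNanos(delay);
    }

    @Override
    public long getDelay(TimeUnit unit) {
        return unit.convert(expiryNanos - System.nanoTime(), TimeUnit.NANOSECONDS);
    }

    @Override
    public int compareTo(Delayed other) {
        if (other instanceof TimedEntry) {
            return Long.signum(expiryNanos - ((TimedEntry) other).expiryNanos);
        }
        return Long.compare(getDelay(TimeUnit.NANOSECONDS),
                            other.getDelay(TimeUnit.NANOSECONDS));
    }
}

/** Central structure holding N delay queues, all created up front. */
class ShardedDelayQueue<E extends Delayed> {
    private final List<DelayQueue<E>> shards;

    ShardedDelayQueue(int shardCount) {
        shards = new ArrayList<>(shardCount);
        for (int i = 0; i < shardCount; i++) {
            shards.add(new DelayQueue<>());
        }
    }

    /** Maps an ID to its shard; floorMod keeps the index non-negative. */
    private DelayQueue<E> shardFor(long id) {
        return shards.get((int) Math.floorMod(id, (long) shards.size()));
    }

    void add(long id, E element) {
        shardFor(id).add(element);
    }

    boolean remove(long id, E element) {
        return shardFor(id).remove(element);
    }

    /** Non-blocking poll of one shard; returns null if nothing has expired there. */
    E pollShard(int shardIndex) {
        return shards.get(shardIndex).poll();
    }

    int size() {
        int total = 0;
        for (DelayQueue<E> shard : shards) {
            total += shard.size();
        }
        return total;
    }
}
```

How the housekeeping side drains the shards is left open here: it might run one thread per shard calling take(), or a single thread cycling over pollShard().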

Yair Zaslavsky answered Sep 18 '22 13:09



If you only need to perform housekeeping "roughly every five minutes", this is a lot of work to maintain that.

What I would do is have a task which runs every minute (or more often, as required) and checks, for each object, whether it has been five minutes since its last update. If you use this approach, there is no additional collection to maintain and no data structure is altered on an update. The overhead of scanning the components is increased, but is constant. The overhead of performing updates becomes trivial (setting a field with the time of the last update).
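A rough sketch of this approach, under assumptions: `StateObject`, `HousekeepingScanner` and their members are hypothetical names, the objects are assumed to live in a map keyed by a `long` ID, and the current time is passed in explicitly so the logic is easy to exercise.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical tracked object; only the timestamp matters for this sketch. */
class StateObject {
    final long id;
    volatile long lastUpdatedMillis; // the only field an update has to touch

    StateObject(long id) {
        this.id = id;
    }
}

class HousekeepingScanner {
    private final Map<Long, StateObject> objects = new ConcurrentHashMap<>();
    private final long maxIdleMillis;

    HousekeepingScanner(long maxIdleMillis) {
        this.maxIdleMillis = maxIdleMillis;
    }

    /** New and update are both just a timestamp write; no queue is touched. */
    void onNewOrUpdate(long id, long nowMillis) {
        objects.computeIfAbsent(id, StateObject::new).lastUpdatedMillis = nowMillis;
    }

    void onDelete(long id) {
        objects.remove(id);
    }

    /** Full scan: returns the IDs whose last update is at least maxIdleMillis old. */
    List<Long> scan(long nowMillis) {
        List<Long> due = new ArrayList<>();
        for (StateObject o : objects.values()) {
            if (nowMillis - o.lastUpdatedMillis >= maxIdleMillis) {
                due.add(o.id);
            }
        }
        return due;
    }
}
```

The periodic task itself could be a `ScheduledExecutorService` calling `scan(System.currentTimeMillis())` once a minute via `scheduleAtFixedRate`. What happens to the returned objects after housekeeping (for example, resetting their timestamp) is not specified in the answer and is left to the caller.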

Peter Lawrey answered Sep 21 '22 13:09
