I have a project that keeps track of state information for over 500k objects. The program receives 10k updates/second about these objects; each update is a new, update, or delete operation.
As part of the program, housekeeping must be performed on these objects roughly every five minutes. For this purpose I've placed them in a DelayQueue, implementing the Delayed interface, so that the blocking functionality of the DelayQueue controls housekeeping of these objects.
Upon new, an object is placed on the DelayQueue.
Upon update, the object is remove()'d from the DelayQueue, updated, and then reinserted at its new position dictated by the updated information.
Upon delete, the object is remove()'d from the DelayQueue.
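To make the three operations concrete, here is a minimal sketch of the pattern described above. StateObject, QueueTracker, and the ID-to-object map are hypothetical names introduced only for illustration; the real classes will differ.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

// Hypothetical state object; only the housekeeping deadline is modelled.
class StateObject implements Delayed {
    final long id;
    private volatile long dueNanos; // absolute deadline on the System.nanoTime() clock

    StateObject(long id, long delayNanos) {
        this.id = id;
        this.dueNanos = System.nanoTime() + delayNanos;
    }

    // Must only be called while the object is NOT in the queue,
    // otherwise the queue's internal heap ordering is corrupted.
    void reschedule(long delayNanos) {
        dueNanos = System.nanoTime() + delayNanos;
    }

    @Override
    public long getDelay(TimeUnit unit) {
        return unit.convert(dueNanos - System.nanoTime(), TimeUnit.NANOSECONDS);
    }

    @Override
    public int compareTo(Delayed other) {
        long diff = getDelay(TimeUnit.NANOSECONDS) - other.getDelay(TimeUnit.NANOSECONDS);
        return Long.signum(diff);
    }
}

// Hypothetical wrapper showing the new / update / delete handling.
class QueueTracker {
    private final Map<Long, StateObject> byId = new ConcurrentHashMap<>();
    final DelayQueue<StateObject> queue = new DelayQueue<>();

    void onNew(long id, long delayNanos) {
        StateObject o = new StateObject(id, delayNanos);
        byId.put(id, o);
        queue.put(o);
    }

    void onUpdate(long id, long delayNanos) {
        StateObject o = byId.get(id);
        if (o == null) return;
        queue.remove(o);          // linear scan of the queue while holding its lock
        o.reschedule(delayNanos); // safe: the object is out of the queue here
        queue.put(o);
    }

    void onDelete(long id) {
        StateObject o = byId.remove(id);
        if (o != null) queue.remove(o); // same linear scan
    }
}
```

DelayQueue is backed by a PriorityQueue, whose remove(Object) is documented as linear time, so every update and delete walks the queue under the same lock the housekeeping thread needs.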
The problem I'm facing is that the remove() method becomes a prohibitively long operation once the queue passes around 450k objects.
The program is multithreaded: one thread handles updates and another the housekeeping. Due to the remove() delay, we get nasty locking performance issues, and eventually the update thread's buffer consumes all of the heap space.
I've managed to work around this by creating a DelayedWeakReference (extends WeakReference implements Delayed), which allows me to leave "shadow" objects in the queue until they would expire normally.
This takes the performance issue away, but causes a significant increase in memory requirements: there end up being around 5 DelayedWeakReferences for every object that actually needs to be in the queue.
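A minimal sketch of what such a class could look like. The constructor signature, the fixed deadline, and the idea of calling clear() on the old reference at update time are assumptions made for illustration, not a confirmed implementation.

```java
import java.lang.ref.WeakReference;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

// Sketch of a weak reference that can sit in a DelayQueue.
// Instead of remove()-ing on update, the old reference is cleared and left
// in the queue as a "shadow"; a fresh reference is enqueued for the new deadline.
class DelayedWeakReference<T> extends WeakReference<T> implements Delayed {
    private final long expiryNanos; // absolute deadline, never changes once queued

    DelayedWeakReference(T referent, long delay, TimeUnit unit) {
        super(referent);
        this.expiryNanos = System.nanoTime() + unit.toNanos(delay);
    }

    @Override
    public long getDelay(TimeUnit unit) {
        return unit.convert(expiryNanos - System.nanoTime(), TimeUnit.NANOSECONDS);
    }

    @Override
    public int compareTo(Delayed other) {
        long diff;
        if (other instanceof DelayedWeakReference) {
            // Compare deadlines by subtraction so nanoTime overflow is handled.
            diff = expiryNanos - ((DelayedWeakReference<?>) other).expiryNanos;
        } else {
            diff = getDelay(TimeUnit.NANOSECONDS) - other.getDelay(TimeUnit.NANOSECONDS);
        }
        return Long.signum(diff);
    }
}
```

Under these assumptions the housekeeping thread would take() a reference, call get(), and simply skip it when the result is null. The memory cost follows directly: each update leaves one more shadow in the queue until its original deadline passes.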
Is anyone aware of a DelayQueue with additional tracking that permits fast remove() operations? Or does anyone have suggestions for better ways to handle this without consuming significantly more memory?
It took me some time to think about this, but after reading your interesting question for a few minutes, here are my ideas:
A. If your objects have some sort of ID, use it as a hash key, and instead of one delay queue, keep N delay queues.
This reduces lock contention by roughly a factor of N, since each remove() locks and scans only one of the N queues.
There will be a central data structure holding these N queues. Since N is preconfigured, you can create all N queues when the system starts.
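A rough sketch of idea A, assuming the objects carry a numeric ID. ShardedDelayQueues, ShardItem, and all method names here are hypothetical and only meant to show the shape of it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

// Hypothetical element type with an ID and a fixed deadline.
class ShardItem implements Delayed {
    final long id;
    private final long dueNanos;

    ShardItem(long id, long delayNanos) {
        this.id = id;
        this.dueNanos = System.nanoTime() + delayNanos;
    }

    @Override
    public long getDelay(TimeUnit unit) {
        return unit.convert(dueNanos - System.nanoTime(), TimeUnit.NANOSECONDS);
    }

    @Override
    public int compareTo(Delayed other) {
        return Long.signum(getDelay(TimeUnit.NANOSECONDS) - other.getDelay(TimeUnit.NANOSECONDS));
    }
}

// Central structure holding N queues, all created up front.
class ShardedDelayQueues<E extends Delayed> {
    private final List<DelayQueue<E>> shards;

    ShardedDelayQueues(int n) {
        List<DelayQueue<E>> list = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            list.add(new DelayQueue<>());
        }
        this.shards = list;
    }

    // The same ID always maps to the same queue.
    private DelayQueue<E> shardFor(long id) {
        return shards.get(Math.floorMod(Long.hashCode(id), shards.size()));
    }

    void add(long id, E element) {
        shardFor(id).put(element);
    }

    // Still a linear scan, but over roughly 1/N of the elements and one lock.
    boolean remove(long id, E element) {
        return shardFor(id).remove(element);
    }

    // Non-blocking: returns one expired element from any shard, or null.
    E pollExpired() {
        for (DelayQueue<E> shard : shards) {
            E e = shard.poll();
            if (e != null) return e;
        }
        return null;
    }

    int size() {
        int total = 0;
        for (DelayQueue<E> shard : shards) total += shard.size();
        return total;
    }
}
```

One thing to decide with this layout is how housekeeping waits: a single thread cannot block in take() on N queues at once, so you would either poll them as above or run one housekeeping thread per queue.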
If you only need to perform housekeeping "roughly every five minutes", this is a lot of work to maintain that.
What I would do is have a task which runs every minute (or less, as required) to see if it has been five minutes since the last update. If you use this approach, there is no additional collection to maintain and no data structure is altered on an update. The overhead of scanning the components is increased, but is constant. The overhead of performing updates becomes trivial (setting a field with the last time updated).
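Something along these lines, as a sketch only. TrackedState, PeriodicHousekeeper, and the choice to restamp an object once it has been housekept are my assumptions; the current time is passed in as a parameter to keep the logic easy to test.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;

// Hypothetical state object: an update only writes one field.
class TrackedState {
    volatile long lastUpdatedNanos;
}

class PeriodicHousekeeper {
    // The map the program already needs for ID lookups; no second collection.
    private final Map<Long, TrackedState> objects = new ConcurrentHashMap<>();
    private final long maxAgeNanos;

    PeriodicHousekeeper(long maxAge, TimeUnit unit) {
        this.maxAgeNanos = unit.toNanos(maxAge);
    }

    // New and update are the same cheap operation: stamp the time.
    void onNewOrUpdate(long id, long nowNanos) {
        objects.computeIfAbsent(id, k -> new TrackedState()).lastUpdatedNanos = nowNanos;
    }

    void onDelete(long id) {
        objects.remove(id);
    }

    // Scans every object and returns the IDs whose last update is at least
    // maxAge old. Assumption: a due object is restamped so that it becomes
    // due again one full interval later instead of on every sweep.
    List<Long> sweep(long nowNanos) {
        List<Long> due = new ArrayList<>();
        for (Map.Entry<Long, TrackedState> entry : objects.entrySet()) {
            TrackedState state = entry.getValue();
            if (nowNanos - state.lastUpdatedNanos >= maxAgeNanos) {
                due.add(entry.getKey());
                state.lastUpdatedNanos = nowNanos;
            }
        }
        return due;
    }

    int size() {
        return objects.size();
    }
}
```

The sweep could be driven by a ScheduledExecutorService, for example scheduleAtFixedRate(() -> housekeeper.sweep(System.nanoTime()), 1, 1, TimeUnit.MINUTES). The cost is a full pass over all objects per sweep, but updates and deletes never touch an ordered structure.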