Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

A priority queue which allows efficient priority update?

Tags:

UPDATE: Here's my implementation of Hashed Timing Wheels. Please let me know if you have an idea to improve the performance and concurrency. (20-Jan-2009)

// Sample usage:
public static void main(String[] args) throws Exception {
    Timer timer = new HashedWheelTimer();
    for (int i = 0; i < 100000; i ++) {
        timer.newTimeout(new TimerTask() {
            public void run(Timeout timeout) throws Exception {
                // Extend another second.
                timeout.extend();
            }
        }, 1000, TimeUnit.MILLISECONDS);
    }
}

UPDATE: I solved this problem by using Hierarchical and Hashed Timing Wheels. (19-Jan-2009)

I'm trying to implement a special purpose timer in Java which is optimized for timeout handling. For example, a user can register a task with a dead line and the timer could notify a user's callback method when the dead line is over. In most cases, a registered task will be done within a very short amount of time, so most tasks will be canceled (e.g. task.cancel()) or rescheduled to the future (e.g. task.rescheduleToLater(1, TimeUnit.SECOND)).

I want to use this timer to detect an idle socket connection (e.g. close the connection when no message is received in 10 seconds) and write timeout (e.g. raise an exception when the write operation is not finished in 30 seconds.) In most cases, the timeout will not occur, client will send a message and the response will be sent unless there's a weird network issue..

I can't use java.util.Timer or java.util.concurrent.ScheduledThreadPoolExecutor because they assume most tasks are supposed to be timed out. If a task is cancelled, the cancelled task is stored in its internal heap until ScheduledThreadPoolExecutor.purge() is called, and it's a very expensive operation. (O(NlogN) perhaps?)

In traditional heaps or priority queues I've learned in my CS classes, updating the priority of an element was an expensive operation (O(logN) in many cases because it can only be achieved by removing the element and re-inserting it with a new priority value. Some heaps like Fibonacci heap has O(1) time of decreaseKey() and min() operation, but what I need at least is fast increaseKey() and min() (or decreaseKey() and max()).

Do you know any data structure which is highly optimized for this particular use case? One strategy I'm thinking of is just storing all tasks in a hash table and iterating all tasks every second or so, but it's not that beautiful.

like image 980
trustin Avatar asked Jan 16 '09 11:01

trustin


3 Answers

How about trying to separate the handing of the normal case where things complete quickly from the error cases?

Use both a hash table and a priority queue. When a task is started it gets put in the hash table and if it finishes quickly it gets removed in O(1) time.

Every one second you scan the hash table and any tasks that have been a long time, say .75 seconds, get moved to the priority queue. The priority queue should always be small and easy to handle. This assumes that one second is much less than the timeout times you are looking for.

If scanning the hash table is too slow, you could use two hash tables, essentially one for even-numbered seconds and one for odd-numbered seconds. When a task gets started it is put in the current hash table. Every second move all the tasks from the non-current hash table into the priority queue and swap the hash tables so that the current hash table is now empty and the non-current table contains the tasks started between one and two seconds ago.

There options are a lot more complicated than just using a priority queue, but are pretty easily implemented should be stable.

like image 196
David Norman Avatar answered Oct 12 '22 09:10

David Norman


To the best of my knowledge (I wrote a paper about a new priority queue, which also reviewed past results), no priority queue implementation gets the bounds of Fibonacci heaps, as well as constant-time increase-key.

There is a small problem with getting that literally. If you could get increase-key in O(1), then you could get delete in O(1) -- just increase the key to +infinity (you can handle the queue being full of lots of +infinitys using some standard amortization tricks). But if find-min is also O(1), that means delete-min = find-min + delete becomes O(1). That's impossible in a comparison-based priority queue because the sorting bound implies (insert everything, then remove one-by-one) that

n * insert + n * delete-min > n log n.

The point here is that if you want a priority-queue to support increase-key in O(1), then you must accept one of the following penalties:

  • Not be comparison based. Actually, this is a pretty good way to get around things, e.g. vEB trees.
  • Accept O(log n) for inserts and also O(n log n) for make-heap (given n starting values). This sucks.
  • Accept O(log n) for find-min. This is entirely acceptable if you never actually do find-min (without an accompanying delete).

But, again, to the best of my knowledge, no one has done the last option. I've always seen it as an opportunity for new results in a pretty basic area of data structures.

like image 40
A. Rex Avatar answered Oct 12 '22 10:10

A. Rex


Use Hashed Timing Wheel - Google 'Hashed Hierarchical Timing Wheels' for more information. It's a generalization of the answers made by people here. I'd prefer a hashed timing wheel with a large wheel size to hierarchical timing wheels.

like image 31
trustin Avatar answered Oct 12 '22 10:10

trustin