Related questions:
I have a very large data set (more than 5 millions items) and I need to get N largest items from it. The most natural way to do it is to use heap/priority queue storing only top N items. There are several good implementations of priority queue for JVM (Scala/Java), namely:
First 2 are nice, but they store all the items, which in my case gives critical memory overhead. Third (Lucene implementation) doesn't have such a drawback, but as I can see from documentation it also doesn't support custom comparator, which makes it useless for me.
So, my question is: Is there a PriorityQueue
implementation with fixed capacity and custom comparator?
UPD. Finally I've created my own implementation, based on Peter's answer:
public class FixedSizePriorityQueue<E> extends TreeSet<E> { private int elementsLeft; public FixedSizePriorityQueue(int maxSize) { super(new NaturalComparator()); this.elementsLeft = maxSize; } public FixedSizePriorityQueue(int maxSize, Comparator<E> comparator) { super(comparator); this.elementsLeft = maxSize; } /** * @return true if element was added, false otherwise * */ @Override public boolean add(E e) { if (elementsLeft == 0 && size() == 0) { // max size was initiated to zero => just return false return false; } else if (elementsLeft > 0) { // queue isn't full => add element and decrement elementsLeft boolean added = super.add(e); if (added) { elementsLeft--; } return added; } else { // there is already 1 or more elements => compare to the least int compared = super.comparator().compare(e, this.first()); if (compared == 1) { // new element is larger than the least in queue => pull the least and add new one to queue pollFirst(); super.add(e); return true; } else { // new element is less than the least in queue => return false return false; } } } }
(where NaturalComparator
is taken from this question)
PriorityQueue. comparator() method shares an important function of setting and returning the comparator that can be used to order the elements in a PriorityQueue. The method returns a null value if the queue follows the natural ordering pattern of the elements. Parameters: The method does not take any parameters.
Class PriorityQueue<E> An unbounded priority queue based on a priority heap. The elements of the priority queue are ordered according to their natural ordering, or by a Comparator provided at queue construction time, depending on which constructor is used. A priority queue does not permit null elements.
Implementation of Priority Queue Priority queue can be implemented using an array, a linked list, a heap data structure, or a binary search tree. Among these data structures, heap data structure provides an efficient implementation of priority queues.
The priority queue supports only comparable elements, which means that the elements are either arranged in an ascending or descending order. For example, suppose we have some values like 1, 3, 4, 8, 14, 22 inserted in a priority queue with an ordering imposed on the values is from least to the greatest.
How can you say Lucene's doesn't support a custom comparator?
Its abstract and you must implement the abstract method lessThan(T a, T b)
You could use a SortedSet e.g. TreeSet with a custom comparator and remove the smallest when the size reachs N.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With