Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Extracting a given number of the highest values in a List

I'm seeking to display a fixed number of items on a web page according to their respective weight (represented by an Integer). The List where these items are found can be of virtually any size.

The first solution that comes to mind is to do a Collections.sort() and to get the items one by one by going through the List. Is there a more elegant solution though that could be used to prepare, say, the top eight items?

like image 356
James P. Avatar asked Apr 12 '10 20:04

James P.


3 Answers

Just go for Collections.sort(..). It is efficient enough.

This algorithm offers guaranteed n log(n) performance.

You can try to implement something more efficient for your concrete case if you know some distinctive properties of your list, but that would not be justified. Furthermore, if your list comes from a database, for example, you can LIMIT it & order it there instead of in code.

like image 188
Bozho Avatar answered Nov 15 '22 06:11

Bozho


Your options:

  1. Do a linear search, maintaining the top N weights found along the way. This should be quicker than sorting a lengthly list if, for some reason, you can't reuse the sorting results between displaying the page (e.g. the list is changing quickly).

    UPDATE: I stand corrected on the linear search necessarily being better than sorting. See Wikipedia article "Selection_algorithm - Selecting k smallest or largest elements" for better selection algorithms.

  2. Manually maintain a List (the original one or a parallel one) sorted in weight order. You can use methods like Collections.binarySearch() to determine where to insert each new item.

  3. Maintain a List (the original one or a parallel one) sorted in weight order by calling Collections.sort() after each modification, batch modifications, or just before display (possibly maintaining a modification flag to avoid sorting an already sorted list).

  4. Use a data structure that maintains sorted weight-order for you: priority queue, tree set, etc. You could also create your own data structure.

  5. Manually maintain a second (possibly weight-ordered) data structure of the top N items. This data structure is updated anytime the original data structure is modified. You could create your own data structure to wrap the original list and this "top N cache" together.

like image 37
Bert F Avatar answered Nov 15 '22 04:11

Bert F


You could use a max-heap.

If your data originates from a database, put an index on that column and use ORDER BY and TOP or LIMIT to fetch only the records you need to display.

like image 30
Mark Byers Avatar answered Nov 15 '22 04:11

Mark Byers