
Memory consumption of Java Collections.sort()

I have an ArrayList filled with 1.5 million objects of some class. When I sort this list with the Collections.sort method, the memory allocated by the JVM increases dramatically.

So my questions are:

Is that normal? What could be reasons for that? Is this a matter of the garbage collector working too slowly or not being started often enough? Do the objects in the list have to fulfill certain specifications to consume less memory during sort (besides not containing that much data)?

Thx!

vern asked Sep 26 '22 15:09




1 Answer

Before Java 8, the default sorting implementation first copies all elements of the List into a new array, sorts that array, and then writes the sorted elements back into the list. This array copy causes the additional heap consumption that you observe while sorting. The copy is necessary because a generic sorting algorithm has no knowledge of the list's structure, for example whether it supports random access or not.
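To make the copy step concrete, here is a minimal sketch of the copy, sort, write-back pattern described above. It is an illustration under my own naming (`CopySortSketch` and `copySort` are hypothetical), not the JDK source:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.ListIterator;

public class CopySortSketch {

    // Hypothetical helper: sorts a list by copying it into a temporary array,
    // sorting the array, and writing the elements back through an iterator.
    static <T extends Comparable<? super T>> void copySort(List<T> list) {
        Object[] copy = list.toArray();   // temporary array: one reference per element
        Arrays.sort(copy);                // the copy is sorted, not the list itself
        ListIterator<T> it = list.listIterator();
        for (Object element : copy) {
            it.next();
            @SuppressWarnings("unchecked")
            T sorted = (T) element;
            it.set(sorted);               // write back in sorted order
        }
    }

    public static void main(String[] args) {
        List<Integer> numbers = new ArrayList<>(Arrays.asList(5, 3, 9, 1));
        copySort(numbers);
        System.out.println(numbers); // prints [1, 3, 5, 9]
    }
}
```

For a list of 1.5 million elements, the temporary array alone holds 1.5 million references, on top of whatever the sorting algorithm allocates internally.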

Java 8 changed this: sorting is now delegated to each List implementation through the List.sort method, which became possible with default methods. ArrayList overrides this method to sort its backing array directly, so the extra array copy is no longer needed. An upgrade to Java 8 would therefore most likely reduce the overhead you observe.
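On Java 8 or later, calling `List.sort` directly on the ArrayList looks roughly like this. The `Item` class and its `id` field are hypothetical stand-ins for the class in the question:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class InPlaceSortSketch {

    // Hypothetical element type standing in for "objects of some class".
    static final class Item {
        final int id;

        Item(int id) {
            this.id = id;
        }
    }

    // Builds a list in descending id order and sorts it ascending via List.sort.
    static List<Item> buildAndSort(int n) {
        List<Item> items = new ArrayList<>(n);
        for (int i = n; i > 0; i--) {
            items.add(new Item(i));
        }
        // ArrayList provides its own sort, so no list-to-array copy is made here.
        items.sort(Comparator.comparingInt(item -> item.id));
        return items;
    }

    public static void main(String[] args) {
        List<Item> items = buildAndSort(1_500_000);
        System.out.println(items.get(0).id); // prints 1
    }
}
```

The sorting algorithm itself may still allocate temporary storage, so this removes the list copy, not every allocation.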

Garbage collection is not at fault here. Large arrays are unfortunately expensive to handle because they probably do not fit into the young generation and can eventually trigger a full collection.

Furthermore, as mentioned in the comments, the actual sorting has been performed by TimSort since Java 7, through the Arrays.sort implementation. TimSort itself requires additional heap space. From the javadoc:

Temporary storage requirements vary from a small constant for nearly sorted input arrays to n/2 object references for randomly ordered input arrays.

If this overhead is a problem for your use case, you can switch back to the previous merge-sort implementation by setting the system property java.util.Arrays.useLegacyMergeSort to true.
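The usual way to set the property is on the command line, for example `java -Djava.util.Arrays.useLegacyMergeSort=true YourMainClass` (the class name is a placeholder). As far as I can tell, the JDK reads the property only once, so setting it later via System.setProperty may come too late. A small sketch to check whether the flag reached the JVM, with hypothetical class and method names:

```java
public class LegacySortFlagCheck {

    // The property name is taken from the answer above.
    static final String LEGACY_SORT_PROPERTY = "java.util.Arrays.useLegacyMergeSort";

    // Returns true only if the property is set to "true" (case-insensitive).
    static boolean legacyMergeSortRequested() {
        return Boolean.getBoolean(LEGACY_SORT_PROPERTY);
    }

    public static void main(String[] args) {
        System.out.println("legacy merge sort requested: " + legacyMergeSortRequested());
    }
}
```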

Overall, however, TimSort is still more memory-efficient than the legacy merge sort, because the merge sort requires another full array copy (n references instead of at most n/2).

Rafael Winterhalter answered Sep 29 '22 07:09