Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Java OutOfMemory during sorting

I have a task about building a pyramid using list of numbers, but there is one problem with one test. In my task I need to sort a list. I use Collections.sort():

Collections.sort(inputNumbers, (o1, o2) -> {
            if (o1 != null && o2 != null) {
                return o1.compareTo(o2);
            } else {
                throw new CannotBuildPyramidException("Unable to build a pyramid");
            }
        });

But this test fails

@Test(expected = CannotBuildPyramidException.class)
    public void buildPyramid8() {
        // given
            List<Integer> input = Collections.nCopies(Integer.MAX_VALUE - 1, 0);

        // run
        int[][] pyramid = pyramidBuilder.buildPyramid(input);

        // assert (exception)
    }

with OutOfMemoryError instead of my own CannotBuildPyramidException(it will be thrown in another method after sorting). I understand that it is because of TimSort in Collections.sort() method. I tried to use HeapSort, but I couldn`t even swap elements because my input list was initialized as Arrays.asList() and when I use set() method I get UnsupportedOperationException. Then I tried to convert my list to common ArrayList

ArrayList<Integer> list = new ArrayList<>(inputNumbers);

but I got OutOfMemoryError again. It`s not allowed to edit tests. I dont know what to do with this problem. Im using Java8 and IntelliJIdea SDK

like image 397
PashaMinigun Avatar asked Jan 22 '20 14:01

PashaMinigun


2 Answers

Note that the list created by Collections.nCopies(Integer.MAX_VALUE - 1, 0) uses a tiny amount of memory and is immutable. The documentation says "Returns an immutable list consisting of n copies of the specified object. The newly allocated data object is tiny (it contains a single reference to the data object)". And if you look at the implementation, you'll see it does exactly what one would expect from that description. It returns a List object that only pretends to be large, only holding the size and the element once and returning that element when asked about any index.

The problem with Collections.sort is then two-fold:

  • The list must not be immutable, but that list is. That btw also explains the UnsupportedOperationException you got when you tried to set().
  • For performance reasons, it "obtains an array containing all elements in this list, sorts the array, [and writes back to the list]". So at this point the tiny pretend-list does get blown up and causes your memory problem.

So you need to find some other way to sort. One that works in-place and doesn't swap anything for this input (which is correct, as the list is already sorted). You could for example use bubble sort, which takes O(n) time and O(1) space on this input and doesn't attempt any swaps here.

Btw, about getting the memory problem "because of TimSort": Timsort is really not to blame. You don't even get to the Timsort part, as it's the preparatory copy-to-array that causes the memory problem. And furthermore, Timsort is smart and would detect that the data is already sorted and then wouldn't do anything. So if you actually did get to the Timsort part, or if you could directly apply it to the list, Timsort wouldn't cause a problem.

like image 182
Kelly Bundy Avatar answered Nov 15 '22 00:11

Kelly Bundy


This list is too huge! Collections.nCopies(Integer.MAX_VALUE - 1, 0); gives us list of 2^31-1 elements (2147483647), each one taking about 4 bytes in memory (this is "simplified" size of Integer). If we multiply it, we'll have about 8.59 GB of memory required to store all those numbers. Are you sure you have enough memory to store it?

I believe this test is written in a very bad manner - one should never try to create such huge List.

like image 37
Greg Witczak Avatar answered Nov 14 '22 23:11

Greg Witczak