Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Extremly bad perfomance of ArrayList<Int> for large number of elements compared with IntArray

I made 2 implementations of quick sort: sequential and parallel. The second one was written using ForkJoinPool using 4 threads (I added implementaton below) The sort function worked with ArrayList:

override fun sort(array: ArrayList<Int>): ArrayList<Int> { 
   ...
}

I tested both of them on arrays of size 1e3-1e7, and for all of the sizes parallel implementation was ~2.5 times faster then sequential. But starting from the size=1e8 parallel algorithm works about 2 times slower then sequential one.

class ParQuickSort(
    private val seqBlockSize:Int = 1000,
    nThreads: Int = 4
) : QuickSort {

    private val threadPool = ForkJoinPool(nThreads)

    override fun sort(array: IntArray): IntArray {
        threadPool.invoke(
            SortTask(array, 0, array.size - 1, seqBlockSize)
        )
        return array
    }

    fun shutdown() {
        threadPool.shutdownNow()
    }

    private class SortTask(
        val array: IntArray,
        val l: Int,
        val r: Int,
        val seqBlockSize: Int
    ) : RecursiveTask<Unit>() {

        private val rand = Random()

        private val seqSort = SeqQuickSort()

        override fun compute() {
            if (l < r) {
                if (r - l <= seqBlockSize) {
                    seqSort.quickSortInterval(array, l, r)
                    return
                }
                val m = partition(array, l, r)

                invokeAll(
                    SortTask(array, l, m, seqBlockSize),
                    SortTask(array, m + 1, r, seqBlockSize)
                )
            }
        }

        private fun partition(array: IntArray, l: Int, r: Int): Int {
            ...
        }

        private fun swap(array: IntArray, first: Int, second: Int) {
            ...
        }

    }
}

When I changed ArrayList to IntArray parallel implementation showed better result for 1e8 array size (~2.5 times faster then sequential as for smaller arrays) as I expected.

I know abount disadventeges of ArrayList - autoboxing, unboxing e.t.c, and that IntArray is equivalent to int[] so it works with primitives. But why parallel implementation using ArrayList started to give bad perfromance only for large number of elements (1e8)?

like image 437
avelobos Avatar asked Oct 12 '25 16:10

avelobos


1 Answers

ArrayList<Integer> takes somewhere on the order of 64+64+64 bits of memory per integer:

  • Per value in the list, an Integer instance is created. This instance is ~128 bits large: 64 bits are occupied by the object header (which, amongst other things, identifies this object as being of the Integer type - objects know what they are, this takes some memory).
  • ... and that instance has a field containing the value. You'd think this is only 32 bits. It's not: On 64-bit processors, doing things to memory that isn't nicely located exactly on a spot that is evenly divisible by 64 bit, is very annoying. Hence, effectively, 64 bits of memory are 'occupied' (in the sense of 'no longer available for other stuff') by that 32 bit variable.
  • the list itself is a sequence of pointers, pointing at those integer objects (in java parlance, 'reference'. They are pointers). Depending on various aspects of the JVM, each pointer is either 64 bit or 32 bit (compressed OOPS and other tricks the JVM does to compress pointers may apply here).

In contrast, an int[] is simple: Each value takes up 32 bits. No more and no less.

That's a factor 6 - List<Integer> takes ~6x more memory to store Y integers than int[] does. (in kotlin, List<Int> is, as far as I know, an alias for List<Integer>.

Thus, likely, the reason performance falls off a cliff at that point is that your main memory is full and swapping is going on. Swapping is already expensive; it tends to get more expensive when parallelism is introduced.

like image 62
rzwitserloot Avatar answered Oct 14 '25 06:10

rzwitserloot