I'm adding millions of entries of a custom object to a List
. This turns out to be very slow and the graphical user interface keeps freezing randomly as well (not responding on Windows) even though the adding operation is wrapped in a SwingWorker
so it should not affect the EDT. Furthermore, the CPU
utilization goes up to around 70% - 90%.
I'm initializing the list as follows:
List<SearchResult> updatedSearchResults = new ArrayList<>();
Then I'm repeatedly calling
SearchResult searchResult = new SearchResult(...);
updatedSearchResults.add(searchResult);
to add elements.
I'm aware that the ArrayList
internally uses an array which is resized when it is full and all elements are copied over. However, using a LinkedList
has shown to be slower than the ArrayList
though and no other List implementation exists.
According to this answer, adding to an ArrayList
has a performance of O(1)
. The question arises why my approach of collecting all results is still so awful performance-wise. SearchResult
is a simple data object encapsulating fields and initializing it is cheap since only fields are set.
As comparison, adding to the list is noticeably slower than sending the same amount of data over a network socket. This doesn't make any sense since local operations should always be way faster than network related operations.
1 million
elements finishes in about 0.2 seconds
but 67 million
won't even finish in a reasonable amount of time. It should finish like a few seconds at maximum.
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;
public class ListPerformanceTester
{
public enum ValueSize
{
EIGHT_BIT("8-Bit"),
SIXTEEN_BIT("16-Bit"),
THIRTY_TWO_BIT("32-Bit"),
SIXTY_FOUR_BIT("64-Bit"),
NINETY_SIX_BIT("96-Bit");
private String name;
ValueSize(String name)
{
this.name = name;
}
public int getBytesCount()
{
switch (this)
{
case EIGHT_BIT:
return 1;
case SIXTEEN_BIT:
return 2;
case THIRTY_TWO_BIT:
return 4;
case SIXTY_FOUR_BIT:
return 8;
case NINETY_SIX_BIT:
return 12;
}
throw new IllegalStateException("Bytes count undefined for " + this);
}
@Override
public String toString()
{
return name;
}
public static ValueSize parse(String name)
{
ValueSize[] valueSizes = values();
for (ValueSize valueSize : valueSizes)
{
String currentName = valueSize.toString();
if (currentName.equals(name))
{
return valueSize;
}
}
throw new IllegalArgumentException("No value size associated");
}
}
public static class SearchResult implements Cloneable, Comparable
{
private int address;
private ValueSize valueSize;
private BigInteger previousValue;
private BigInteger currentValue;
private BigInteger valueDifference;
public SearchResult(int address, BigInteger previousValue,
BigInteger currentValue, ValueSize valueSize)
{
this.address = address;
this.valueSize = valueSize;
this.previousValue = previousValue;
this.currentValue = currentValue;
setValueDifference();
}
private void setValueDifference()
{
BigInteger subtractionResult = previousValue.subtract(currentValue);
valueDifference = subtractionResult.abs();
}
private byte[] getBytes(BigInteger bigInteger) throws IOException
{
byte[] retrieved = bigInteger.toByteArray();
ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
int bytesCount = getValueSize().getBytesCount();
int paddingBytes = bytesCount - retrieved.length;
if (paddingBytes >= 0)
{
byteArrayOutputStream.write(new byte[paddingBytes]);
byteArrayOutputStream.write(retrieved);
} else
{
writeWithoutLeadingNullBytes(byteArrayOutputStream, retrieved);
}
return byteArrayOutputStream.toByteArray();
}
private void writeWithoutLeadingNullBytes(ByteArrayOutputStream byteArrayOutputStream, byte[] bytes)
{
int index = 0;
boolean nonNullByteFound = false;
while (index < bytes.length)
{
byte value = bytes[index];
if (value != 0 || nonNullByteFound)
{
nonNullByteFound = true;
byteArrayOutputStream.write(value);
}
index++;
}
}
public int getAddress()
{
return address;
}
@Override
public boolean equals(Object object)
{
if (!(object instanceof SearchResult))
{
return false;
}
SearchResult searchResult = (SearchResult) object;
return searchResult.getAddress() == getAddress();
}
@Override
public int hashCode()
{
return Integer.hashCode(address);
}
public ValueSize getValueSize()
{
return valueSize;
}
@Override
public SearchResult clone()
{
return new SearchResult(address, previousValue, currentValue, valueSize);
}
@Override
public int compareTo(Object object)
{
return new Integer(address).compareTo(((SearchResult) object).getAddress());
}
}
public static void main(String[] arguments)
{
long milliseconds = System.currentTimeMillis();
int elementsCount = 2000000;
/*List<Integer> list = new ArrayList<>();
for (int elementIndex = 0; elementIndex < elementsCount; elementIndex++)
{
list.add(0);
}*/
List<SearchResult> searchResults = new ArrayList<>();
for (int elementIndex = 0; elementIndex < elementsCount; elementIndex++)
{
SearchResult searchResult = new SearchResult(0x12345678, new BigInteger("3"), new BigInteger("1"), ValueSize.EIGHT_BIT);
searchResults.add(searchResult);
}
System.out.println((System.currentTimeMillis() - milliseconds) / (double) 1000 + " seconds");
}
}
What is the big issue here?
ArrayList internally uses and array to store the elements, when that array gets filled by inserting elements a new array of roughly 1.5 times the size of the original array is created and all the data of old array is copied to new array.
ArrayList s are more than twice as slow as the corresponding array.
2) Manipulation with ArrayList is slow because it internally uses an array. If any element is removed from the array, all the other elements are shifted in memory. Manipulation with LinkedList is faster than ArrayList because it uses a doubly linked list, so no bit shifting is required in memory.
Array is faster and that is because ArrayList uses a fixed amount of array. However when you add an element to the ArrayList and it overflows. It creates a new Array and copies every element from the old one to the new one. You will only feel this if you add to often.
You're running out of memory and the JVM begins continuous Garbage Collection.
To show this, I created this program:
public class Test {
public static void main(String[] args) {
List<Long> list = new ArrayList<>();
long start = System.currentTimeMillis();
for (long i = 0; i < 100_000_000; i++) {
if (i % 1_000_000 == 0)
System.out.printf("%9d: %d ms%n", i, System.currentTimeMillis() - start);
list.add(new Long(i));
}
System.out.printf("Total : %d ms%n", System.currentTimeMillis() - start);
}
}
I running with jdk1.8.0_91, 64-bit on Windows 10.
Running with options -XX:+PrintGCDetails -Xmx1g
, I see:
0: 1 ms
[GC (Allocation Failure) [PSYoungGen: 29801K->5117K(38400K)] 29801K->22565K(125952K), 0.0133554 secs] [Times: user=0.00 sys=0.00, real=0.01 secs]
1000000: 31 ms
[GC (Allocation Failure) [PSYoungGen: 38397K->5120K(71680K)] 55845K->48026K(159232K), 0.0196297 secs] [Times: user=0.13 sys=0.00, real=0.02 secs]
2000000: 62 ms
3000000: 71 ms
[GC (Allocation Failure) [PSYoungGen: 71680K->5104K(71680K)] 114586K->104090K(171008K), 0.0391813 secs] [Times: user=0.34 sys=0.03, real=0.04 secs]
[Full GC (Ergonomics) [PSYoungGen: 5104K->0K(71680K)] [ParOldGen: 98986K->96640K(216064K)] 104090K->96640K(287744K), [Metaspace: 2754K->2754K(1056768K)], 0.5258703 secs] [Times: user=1.59 sys=0.00, real=0.53 secs]
4000000: 643 ms
5000000: 650 ms
[GC (Allocation Failure) [PSYoungGen: 66560K->5120K(99328K)] 163200K->162716K(315392K), 0.0463488 secs] [Times: user=0.36 sys=0.02, real=0.05 secs]
... 30 lines removed ...
28000000: 8269 ms
29000000: 8273 ms
30000000: 8277 ms
[Full GC (Ergonomics) [PSYoungGen: 140722K->140712K(232960K)] [ParOldGen: 699038K->699038K(699392K)] 839760K->839750K(932352K), [Metaspace: 2768K->2768K(1056768K)], 2.1019108 secs] [Times: user=14.30 sys=0.06, real=2.10 secs]
[Full GC (Ergonomics) [PSYoungGen: 140722K->140722K(232960K)] [ParOldGen: 699038K->699038K(699392K)] 839760K->839760K(932352K), [Metaspace: 2768K->2768K(1056768K)], 1.9041177 secs] [Times: user=12.95 sys=0.02, real=1.90 secs]
[Full GC (Ergonomics) [PSYoungGen: 140722K->140722K(232960K)] [ParOldGen: 699038K->699038K(699392K)] 839760K->839760K(932352K), [Metaspace: 2768K->2768K(1056768K)], 1.9111176 secs] [Times: user=13.13 sys=0.02, real=1.91 secs]
[Full GC (Ergonomics) [PSYoungGen: 140722K->140722K(232960K)] [ParOldGen: 699039K->699039K(699392K)] 839762K->839762K(932352K), [Metaspace: 2768K->2768K(1056768K)], 1.8883737 secs] [Times: user=12.72 sys=0.03, real=1.89 secs]
[Full GC (Ergonomics) [PSYoungGen: 140722K->140722K(232960K)] [ParOldGen: 699041K->699041K(699392K)] 839763K->839763K(932352K), [Metaspace: 2768K->2768K(1056768K)], 1.8866147 secs] [Times: user=13.01 sys=0.00, real=1.89 secs]
[Full GC (Ergonomics) [PSYoungGen: 140722K->140722K(232960K)] [ParOldGen: 699042K->699042K(699392K)] 839765K->839765K(932352K), [Metaspace: 2768K->2768K(1056768K)], 1.8867966 secs] [Times: user=12.70 sys=0.03, real=1.89 secs]
... program killed by me ...
If I preallocate memory using -XX:+PrintGCDetails -Xmx1g -Xms1g
, I see:
0: 1 ms
1000000: 19 ms
2000000: 36 ms
3000000: 49 ms
... 30 lines remove ...
29000000: 6320 ms
30000000: 6324 ms
31000000: 6328 ms
[Full GC (Ergonomics) [PSYoungGen: 157018K->152753K(305664K)] [ParOldGen: 699350K->699350K(699392K)] 856368K->852103K(1005056K), [Metaspace: 2769K->2769K(1056768K)], 2.0326459 secs] [Times: user=13.84 sys=0.05, real=2.03 secs]
[Full GC (Allocation Failure) [PSYoungGen: 152753K->152753K(305664K)] [ParOldGen: 699350K->699338K(699392K)] 852103K->852091K(1005056K), [Metaspace: 2769K->2769K(1056768K)], 1.9620070 secs] [Times: user=12.70 sys=0.05, real=1.96 secs]
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:3210)
at java.util.Arrays.copyOf(Arrays.java:3181)
at java.util.ArrayList.grow(ArrayList.java:261)
at java.util.ArrayList.ensureExplicitCapacity(ArrayList.java:235)
at java.util.ArrayList.ensureCapacityInternal(ArrayList.java:227)
at java.util.ArrayList.add(ArrayList.java:458)
at Test.main(Test.java:13)
Heap
PSYoungGen total 305664K, used 162579K [0x00000000eab00000, 0x0000000100000000, 0x0000000100000000)
eden space 262144K, 62% used [0x00000000eab00000,0x00000000f49c4c90,0x00000000fab00000)
from space 43520K, 0% used [0x00000000fab00000,0x00000000fab00000,0x00000000fd580000)
to space 43520K, 0% used [0x00000000fd580000,0x00000000fd580000,0x0000000100000000)
ParOldGen total 699392K, used 699338K [0x00000000c0000000, 0x00000000eab00000, 0x00000000eab00000)
object space 699392K, 99% used [0x00000000c0000000,0x00000000eaaf28d0,0x00000000eab00000)
Metaspace used 2800K, capacity 4486K, committed 4864K, reserved 1056768K
class space used 305K, capacity 386K, committed 512K, reserved 1048576K
Running with enough memory using -XX:+PrintGCDetails -Xmx5g -Xms5g
, program completes normally and fast:
0: 0 ms
1000000: 18 ms
2000000: 35 ms
3000000: 47 ms
4000000: 55 ms
5000000: 69 ms
6000000: 77 ms
7000000: 95 ms
8000000: 102 ms
9000000: 110 ms
10000000: 132 ms
11000000: 140 ms
12000000: 148 ms
13000000: 156 ms
14000000: 185 ms
15000000: 193 ms
16000000: 201 ms
17000000: 209 ms
18000000: 217 ms
19000000: 224 ms
20000000: 232 ms
21000000: 273 ms
22000000: 281 ms
23000000: 289 ms
24000000: 297 ms
25000000: 305 ms
26000000: 313 ms
27000000: 320 ms
28000000: 330 ms
29000000: 338 ms
30000000: 346 ms
31000000: 364 ms
[GC (Allocation Failure) [PSYoungGen: 1145973K->218085K(1529344K)] 1145973K->854725K(5024768K), 0.5204458 secs] [Times: user=3.50 sys=0.20, real=0.52 secs]
32000000: 922 ms
33000000: 927 ms
34000000: 931 ms
35000000: 936 ms
36000000: 940 ms
37000000: 945 ms
38000000: 949 ms
39000000: 953 ms
40000000: 957 ms
41000000: 962 ms
42000000: 966 ms
43000000: 970 ms
44000000: 975 ms
45000000: 979 ms
46000000: 984 ms
47000000: 1028 ms
48000000: 1032 ms
49000000: 1037 ms
50000000: 1041 ms
51000000: 1045 ms
52000000: 1049 ms
53000000: 1053 ms
54000000: 1057 ms
55000000: 1061 ms
56000000: 1065 ms
57000000: 1069 ms
58000000: 1073 ms
59000000: 1077 ms
60000000: 1085 ms
61000000: 1093 ms
62000000: 1101 ms
63000000: 1109 ms
64000000: 1117 ms
65000000: 1125 ms
66000000: 1133 ms
67000000: 1141 ms
[GC (Allocation Failure) [PSYoungGen: 1529317K->218096K(1529344K)] 2165957K->1851537K(5024768K), 0.9403854 secs] [Times: user=5.70 sys=1.11, real=0.94 secs]
68000000: 2088 ms
69000000: 2094 ms
70000000: 2099 ms
71000000: 2165 ms
72000000: 2169 ms
73000000: 2173 ms
74000000: 2177 ms
75000000: 2183 ms
76000000: 2188 ms
77000000: 2193 ms
78000000: 2199 ms
79000000: 2204 ms
80000000: 2209 ms
81000000: 2214 ms
82000000: 2219 ms
83000000: 2225 ms
84000000: 2230 ms
85000000: 2235 ms
86000000: 2240 ms
87000000: 2244 ms
88000000: 2248 ms
89000000: 2252 ms
90000000: 2256 ms
91000000: 2260 ms
92000000: 2265 ms
93000000: 2270 ms
94000000: 2275 ms
95000000: 2280 ms
96000000: 2286 ms
97000000: 2291 ms
98000000: 2296 ms
99000000: 2302 ms
Total : 2307 ms
Heap
PSYoungGen total 1529344K, used 1423154K [0x0000000755580000, 0x00000007c0000000, 0x00000007c0000000)
eden space 1311232K, 91% used [0x0000000755580000,0x000000079ee50960,0x00000007a5600000)
from space 218112K, 99% used [0x00000007b2b00000,0x00000007bfffc010,0x00000007c0000000)
to space 218112K, 0% used [0x00000007a5600000,0x00000007a5600000,0x00000007b2b00000)
ParOldGen total 3495424K, used 1633441K [0x0000000680000000, 0x0000000755580000, 0x0000000755580000)
object space 3495424K, 46% used [0x0000000680000000,0x00000006e3b28508,0x0000000755580000)
Metaspace used 2802K, capacity 4486K, committed 4864K, reserved 1056768K
class space used 302K, capacity 386K, committed 512K, reserved 1048576K
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With