For an experiment, I made this little program. It just generates 10 million random strings and adds them to an arraylist. Notice that the arraylist does not have an initial capacity.
// editors note: added the necessary boilerplate to run,
// and take initial capacity as an optional cmdline arg for easier testing
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
class ArrayListTest {
public static void main(String[] args)
{
int initsize = -1;
if (args.length > 0) {
initsize = Integer.parseInt(args[0]);
}
long startTime = System.currentTimeMillis();
List<String> randNums = initsize>=0 ? new ArrayList<>(initsize) : new ArrayList<>();
// final List<String> randNums = initsize>=0 ? new ArrayList<String>(initsize) : new ArrayList<String>();
Random r = new Random(1);
for (int i = 0; i < 10_000_000; i++) {
final int randomNum = r.nextInt();
randNums.add(Integer.toString(randomNum));
}
System.out.println(System.currentTimeMillis() - startTime);
}
}
I ran it 5 times, and the results in ms were:
8917
8720
7814
8768
8867
I then altered the program to give the ArrayList an initial capacity:
List<String> randNums = new ArrayList<>(10_000_000);
I again ran it 5 times, and these were the results:
11562
10454
10499
10481
10415
It definitely consistently got slower. I assumed it would have been the opposite, since by declaring a big enough initial size, I took away all the overhead of the ArrayList increasing its own capacity. Why was it slower?
Further Info: Jre 1.8.051, 64-bit (On windows 10);
Default initial capacity of ArrayList is 10. java. util.
The grow method in the ArrayList class gives the new size array. In Java 8 and later The new capacity is calculated which is 50% more than the old capacity and the array is increased by that capacity.
If you know in advance what the size of the ArrayList is going to be, it is more efficient to specify the initial capacity. If you don't do this, the internal array will have to be repeatedly reallocated as the list grows. The larger the final list, the more time you save by avoiding the reallocations.
Capacity is always greater than or equal to Count. If Count exceeds Capacity while adding elements, the capacity is automatically increased by reallocating the internal array before copying the old elements and adding the new elements.
You'd think it was the other way around, but it has to do with Garbage Collection.
I didn't see the big difference you did (3900 vs 5100), but since this is GC related, you're probably running with lower memory. I ran with 64-bit and -Xmx4g
.
Turning on the GC log (-Xloggc:path\to\file.log
), I get this without size:
Java HotSpot(TM) 64-Bit Server VM (25.51-b03) for windows-amd64 JRE (1.8.0_51-b16), built on Jun 8 2015 18:03:07 by "java_re" with MS VC++ 10.0 (VS2010)
Memory: 4k page, physical 33478384k(25822824k free), swap 33476532k(26121788k free)
CommandLine flags: -XX:InitialHeapSize=535654144 -XX:MaxHeapSize=4294967296 -XX:+PrintGC -XX:+PrintGCTimeStamps -XX:+UseCompressedClassPointers -XX:+UseCompressedOops -XX:-UseLargePagesIndividualAllocation -XX:+UseParallelGC
0.188: [GC (Allocation Failure) 131584K->114906K(502784K), 0.0795857 secs]
0.358: [GC (Allocation Failure) 246490K->229080K(634368K), 0.0794026 secs]
0.631: [GC (Allocation Failure) 492248K->452871K(716288K), 0.1389293 secs]
0.770: [Full GC (Ergonomics) 452871K->451407K(1188864K), 3.3224726 secs]
4.270: [GC (Allocation Failure) 714575K->714179K(1271808K), 0.2607092 secs]
4.531: [Full GC (Ergonomics) 714179K->814K(1050624K), 0.0070693 secs]
And I get this with the size:
Java HotSpot(TM) 64-Bit Server VM (25.51-b03) for windows-amd64 JRE (1.8.0_51-b16), built on Jun 8 2015 18:03:07 by "java_re" with MS VC++ 10.0 (VS2010)
Memory: 4k page, physical 33478384k(25818308k free), swap 33476532k(26123684k free)
CommandLine flags: -XX:InitialHeapSize=535654144 -XX:MaxHeapSize=4294967296 -XX:+PrintGC -XX:+PrintGCTimeStamps -XX:+UseCompressedClassPointers -XX:+UseCompressedOops -XX:-UseLargePagesIndividualAllocation -XX:+UseParallelGC
0.171: [GC (Allocation Failure) 131584K->129070K(502784K), 0.0919490 secs]
0.370: [GC (Allocation Failure) 260654K->261166K(634368K), 0.4433150 secs]
0.813: [Full GC (Ergonomics) 261166K->260351K(899072K), 1.4135289 secs]
2.460: [GC (Allocation Failure) 523519K->524487K(899072K), 0.7901689 secs]
3.250: [Full GC (Ergonomics) 524487K->523517K(1421312K), 2.3177831 secs]
Because the second run allocated so much more memory initially, the GC runs get slower. That apparently outweighs the extra array copying going on when the list resizes.
List randNums with 10 million elements requires about 700MB Memory Space. In order to avoid the effects of GC(For sure, it means much in this test), I set the Hotspot VM arguments like this:
-XX:+PrintGC
-XX:+PrintGCTimeStamps
-XX:+UseParNewGC
-XX:+UseConcMarkSweepGC
-Xmx1000m
-Xms1000m
-Xmn999m
-XX:SurvivorRatio=65535
make Young generation large enough to save all elements and make no GC during elements allocation. I make Eden Region of Young Generation larger to avoid elements copy within Young Generation.
The result is surprising!. Total execution time decreases from 8s to 0.6s!
Here, I did some extra work for the questioner, that is testing whether or not pre-allocation of ArrayList can save time and how much it helps.
Here is my code:
long startTime;
List<String> randNums;
Random r = new Random(1);
System.out.println("-----------------------------ArrayList With Enough Capacity Allocated:----------");
for(int loop=0;loop<5;loop++) {
startTime = System.currentTimeMillis();
randNums = new ArrayList<String>(SIZE);
for (int i = 0; i <SIZE ; i++) {
int randomNum = r.nextInt();
randNums.add(Integer.toString(randomNum));
}
System.out.println(System.currentTimeMillis() - startTime); //print time of this loop
randNums.clear();
System.gc();
}
System.out.println("\n-----------------------------ArrayList Auto-Capacity:----------");
for(int loop=0;loop<5;loop++) {
startTime = System.currentTimeMillis();
randNums = new ArrayList<String>();
for (int i = 0; i <SIZE ; i++) {
int randomNum = r.nextInt();
randNums.add(Integer.toString(randomNum));
}
System.out.println(System.currentTimeMillis() - startTime); //print time of this loop
randNums.clear();
System.gc();
}
The output is:
-----------------------------ArrayList With Enough Capacity Allocated:----------
625
0.712: [Full GC (System.gc()) 714143K->39628K(1023936K), 0.1450449 secs]
0.863: [GC (CMS Initial Mark) 98268K(1023936K), 0.0549729 secs]
545
1.413: [Full GC (System.gc()) 705185K->564K(1023936K), 0.1239084 secs]
483
2.031: [Full GC (System.gc()) 679570K->564K(1023936K), 0.1256323 secs]
2.160: [GC (CMS Initial Mark) 59357K(1023936K), 0.0274108 secs]
523
2.688: [Full GC (System.gc()) 670987K->564K(1023936K), 0.1222910 secs]
482
3.302: [Full GC (System.gc()) 673223K->564K(1023936K), 0.1299938 secs]
-----------------------------ArrayList Auto-Capacity:----------
3.432: [GC (CMS Initial Mark) 40961K(1023936K), 0.0003740 secs]
3.907: [GC (CMS Final Remark) 698381K(1023936K), 0.1091347 secs]
796
4.240: [Full GC (System.gc()) 911984K->56183K(1023936K), 0.1719540 secs]
4.412: [GC (CMS Initial Mark) 56183K(1023936K), 0.0394210 secs]
4.770: [GC (CMS Final Remark) 528894K(1023936K), 0.0726012 secs]
690
5.111: [Full GC (System.gc()) 893818K->2060K(1023936K), 0.1364215 secs]
5.248: [GC (CMS Initial Mark) 20769K(1023936K), 0.0008902 secs]
5.750: [GC (CMS Final Remark) 694113K(1023936K), 0.1124856 secs]
704
5.962: [Full GC (System.gc()) 808646K->2081K(1023936K), 0.1338328 secs]
6.096: [GC (CMS Initial Mark) 21137K(1023936K), 0.0010118 secs]
6.599: [GC (CMS Final Remark) 688155K(1023936K), 0.0899929 secs]
661
6.767: [Full GC (System.gc()) 810872K->2081K(1023936K), 0.1287272 secs]
6.896: [GC (CMS Initial Mark) 21512K(1023936K), 0.0010619 secs]
7.398: [GC (CMS Final Remark) 691216K(1023936K), 0.1083076 secs]
681
7.586: [Full GC (System.gc()) 803590K->2081K(1023936K), 0.1269813 secs]
7.714: [GC (CMS Initial Mark) 2081K(1023936K), 0.0008965 secs]
Striping GC info, it is:
-----------------------------ArrayList With Enough Capacity Allocated:----------
615
540
480
511
480
-----------------------------ArrayList Auto-Capacity:----------
680
708
640
644
663
We uses last three data of each group calculate the optimization(to avoid JIT and VM optimization). The answer comes like this:
(480+511+511)/(640+644+663) = 1502/1947 = 501/639 = 100/128
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With