I came into this problemn today and I could not figure out why groovy array is not scaling better than Map when it gets bigger.
In my example I create an Map (LinkedHashMap) and an String array (String[]). Then I iterate from 0 to 10^7 inserting i into the Map or Array. I do it 10 times to be sure that outliers don't mess the results.
int max = 10**7
int numTests = 10
long totalTimeMap = 0
long totalTimeArray = 0
numTests.times{
long start = System.currentTimeMillis()
Map m = [:]
max.times {
m[it] = "${it}"
}
long end = System.currentTimeMillis()
totalTimeMap += (end-start)
}
numTests.times {
long start = System.currentTimeMillis()
String[] s = new String[max]
max.times{
s[it] = "${it}"
}
long end = System.currentTimeMillis()
totalTimeArray += (end-start)
}
println "Map: ${totalTimeMap}"
println "Array: ${totalTimeArray}"
The output was unexpected since the Map had a better performance then Array:
Map: 49361
Array: 101123
I made the same experiment in java:
public static void main(String[] args) {
int max = 10000000;
int numTests = 10;
long totalTimeMap = 0;
long totalTimeArray = 0;
for(int i=0; i<numTests; i++){
long start = System.currentTimeMillis();
Map m = new LinkedHashMap();
for(int j=0; j<max; j++){
m.put(j, "" + j);
}
long end = System.currentTimeMillis();
totalTimeMap += (end-start);
}
for(int i=0; i<numTests; i++){
long start = System.currentTimeMillis();
String[] s = new String[max];
for(int j=0; j<max; j++){
s[j] = "" + j;
}
long end = System.currentTimeMillis();
totalTimeArray += (end-start);
}
System.out.println("Map: " + totalTimeMap);
System.out.println("Array: " + totalTimeArray);
}
and the output was expected (Array faster than Map):
Map: 34564
Array: 12822
My question is: why the Map is faster than the Array when using Groovy?
When you are adding the String to the array in Groovy, you're creating a templated String, which is then getting converted back to a java string (after the templating is done) as it has to fit in a String[]
For the Map
version, you're just storing a templated string, so no evaluation has to be done...
The following benchmarking code:
@Grab('org.gperfutils:gbench:0.4.3-groovy-2.4')
int max = 10000
new groovyx.gbench.BenchmarkBuilder().run {
'Array' {
String[] s = new String[max]
max.times { int idx ->
s[idx] = Integer.toString(idx)
}
}
'List' {
def s = []
max.times{
s << "${it}"
}
}
'Map' {
Map m = [:]
max.times {
m[it] = "${it}"
}
}
}.prettyPrint()
Where we don't use GroovyStrings in the Array method, gives me the result:
* Groovy: 2.4.3
* JVM: Java HotSpot(TM) 64-Bit Server VM (25.45-b02, Oracle Corporation)
* JRE: 1.8.0_45
* Total Memory: 800.5 MB
* Maximum Memory: 1820.5 MB
* OS: Mac OS X (10.10.3, x86_64)
Options
=======
* Warm Up: Auto (- 60 sec)
* CPU Time Measurement: On
user system cpu real
Array 1819502 6491 1825993 1833209
List 1697948 6533 1704481 1724448
Map 2040521 8932 2049453 2116760
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With