I am writing a micro-benchmark to compare String concatenation using the + operator vs StringBuilder. To that end, I created a JMH benchmark class, based on an OpenJDK example, that uses the batchSize parameter:
@State(Scope.Thread)
@BenchmarkMode(Mode.AverageTime)
@Measurement(batchSize = 10000, iterations = 10)
@Warmup(batchSize = 10000, iterations = 10)
@Fork(1)
public class StringConcatenationBenchmark {

    private String string;
    private StringBuilder stringBuilder;

    @Setup(Level.Iteration)
    public void setup() {
        string = "";
        stringBuilder = new StringBuilder();
    }

    @Benchmark
    public void stringConcatenation() {
        string += "some more data";
    }

    @Benchmark
    public void stringBuilderConcatenation() {
        stringBuilder.append("some more data");
    }
}
When I run the benchmark, I get the following error for the stringBuilderConcatenation method:
java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf(Arrays.java:3332)
    at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:137)
    at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:121)
    at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:421)
    at java.lang.StringBuilder.append(StringBuilder.java:136)
    at link.pellegrino.string_concatenation.StringConcatenationBenchmark.stringBuilderConcatenation(StringConcatenationBenchmark.java:29)
    at link.pellegrino.string_concatenation.generated.StringConcatenationBenchmark_stringBuilderConcatenation.stringBuilderConcatenation_avgt_jmhStub(StringConcatenationBenchmark_stringBuilderConcatenation.java:165)
    at link.pellegrino.string_concatenation.generated.StringConcatenationBenchmark_stringBuilderConcatenation.stringBuilderConcatenation_AverageTime(StringConcatenationBenchmark_stringBuilderConcatenation.java:130)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.openjdk.jmh.runner.BenchmarkHandler$BenchmarkTask.call(BenchmarkHandler.java:430)
    at org.openjdk.jmh.runner.BenchmarkHandler$BenchmarkTask.call(BenchmarkHandler.java:412)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
I thought the default JVM heap size had to be increased, so I tried allowing up to 10 GB by passing -Xmx10G through the -jvmArgs option provided by JMH. Unfortunately, I still get the error.
Consequently, I tried reducing the batchSize parameter to 1, but I still get an OutOfMemoryError.
The only workaround I have found is to set the benchmark mode to Mode.SingleShotTime. Since this mode seems to treat a batch as a single shot (even though s/op is displayed in the Units column), it seems I get the metric I want: the average time to perform the set of batch operations. However, I still don't understand why it does not work with Mode.AverageTime.
Please also note that the stringConcatenation benchmark works as expected whichever benchmark mode is used. The issue occurs only with the stringBuilderConcatenation method, which uses StringBuilder.
Any help understanding why the previous example does not work with the benchmark mode set to Mode.AverageTime is welcome.
The JMH version I used is 1.10.4.
You're right that Mode.SingleShotTime is what you need: it measures the time for a single batch. With Mode.AverageTime, the iteration keeps running until the iteration time elapses (1 second by default). It reports the time per batch (only batches that fully finished during the iteration are counted), so the final results differ, but the execution time is the same.
Another problem is that @Setup(Level.Iteration) forces setup to be executed before every iteration, but not before every batch. Thus your strings are not actually limited by the batch size. The String version does not cause the OutOfMemoryError only because it is much slower than StringBuilder, so within the 1 second it builds a much shorter string.
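To make the effect of @Setup(Level.Iteration) concrete, here is a minimal plain-Java sketch of one iteration. It is a simplified model, not JMH's actual code: the class IterationModel and the method lengthAfterIteration are hypothetical names, and a fixed number of batches stands in for the timer that JMH really uses to end an iteration.

```java
// Simplified, hypothetical model of one iteration; this is NOT JMH's real code.
class IterationModel {
    static final String S = "some more data"; // 14 characters

    // Mimics @Setup(Level.Iteration): the builder is created once per iteration,
    // and every batch of that iteration keeps appending to the same builder.
    static int lengthAfterIteration(int batches, int batchSize) {
        StringBuilder sb = new StringBuilder();       // setup(): runs once per iteration
        for (int b = 0; b < batches; b++) {           // assumption: fixed count instead of a timer
            for (int i = 0; i < batchSize; i++) {     // one batch = batchSize invocations
                sb.append(S);
            }
        }
        return sb.length();
    }

    public static void main(String[] args) {
        System.out.println(lengthAfterIteration(1, 100)); // prints 1400
        System.out.println(lengthAfterIteration(5, 100)); // prints 7000
    }
}
```

The length grows with the number of batches that fit into the iteration, not with batchSize alone, which is why a fast benchmark method ends up with a far longer string than a slow one.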
A not very pretty way to fix your benchmark (while still using the average time mode and the batchSize parameter) is to reset the string/stringBuilder manually:
import java.util.concurrent.TimeUnit;
import org.openjdk.jmh.annotations.*;

@State(Scope.Thread)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
@Measurement(batchSize = 10000, iterations = 10)
@Warmup(batchSize = 10000, iterations = 10)
@Fork(1)
public class StringConcatenationBenchmark {

    private static final String S = "some more data";
    // Length reached after exactly one batch (10000 appends of S).
    private static final int maxLen = S.length() * 10000;

    private String string;
    private StringBuilder stringBuilder;

    @Setup(Level.Iteration)
    public void setup() {
        string = "";
        stringBuilder = new StringBuilder();
    }

    @Benchmark
    public void stringConcatenation() {
        // Start over once a full batch has been appended.
        if (string.length() >= maxLen) string = "";
        string += S;
    }

    @Benchmark
    public void stringBuilderConcatenation() {
        // Start over once a full batch has been appended.
        if (stringBuilder.length() >= maxLen) stringBuilder = new StringBuilder();
        stringBuilder.append(S);
    }
}
Here are the results on my box (i5 3340, 4 GB RAM, 64-bit Win7, JDK 1.8.0_45):
Benchmark                   Mode  Cnt       Score       Error  Units
stringBuilderConcatenation  avgt   10     145.997 ±     2.301  us/op
stringConcatenation         avgt   10  324878.341 ± 39824.738  us/op
So you can see that only about 3 batches fit into one second for stringConcatenation (1e6/324878), while for stringBuilderConcatenation thousands of batches can be executed, producing an enormous string that leads to the OutOfMemoryError.
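The same arithmetic can be applied to the StringBuilder score to get a rough idea of how large the builder becomes in the original benchmark. The sketch below is only an estimate under stated assumptions: it reuses the 145.997 us/op score measured on the fixed benchmark (appending to a huge builder may be slower), it assumes 2 bytes per character as with the char[] storage of JDK 8, and it ignores the extra space needed while the backing array is being copied during a resize. The class and method names are hypothetical.

```java
// Rough estimate only; class and method names are made up for illustration.
class BuilderGrowthEstimate {

    // Number of whole batches that fit into one iteration.
    static long batchesPerIteration(double iterationMicros, double microsPerBatch) {
        return (long) (iterationMicros / microsPerBatch);
    }

    // Total characters appended when the builder is never reset within the iteration.
    static long charsAppended(long batches, int batchSize, int charsPerAppend) {
        return batches * batchSize * charsPerAppend;
    }

    public static void main(String[] args) {
        // 1 second iteration, 145.997 us per batch (score from the table above).
        long batches = batchesPerIteration(1_000_000.0, 145.997);
        long chars = charsAppended(batches, 10_000, "some more data".length());
        long bytes = chars * 2; // assumption: 2 bytes per char (JDK 8 char[])

        System.out.println(batches); // prints 6849
        System.out.println(chars);   // prints 958860000
        System.out.println(bytes);   // prints 1917720000
    }
}
```

Under these assumptions a single iteration already needs close to 2 GB for the characters alone, before counting the temporary copy made on each capacity increase.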
I don't know why adding more memory doesn't work for you; for me, -Xmx4G is enough to run the stringBuilder test of your original benchmark. Probably your box is faster, so the resulting string is even longer. Note that for a very big string you can hit the array size limit (about 2 billion elements) even if you have enough memory. Check the exception stack trace after adding the memory: is it the same? If you hit the array size limit, it will still be an OutOfMemoryError, but the stack trace will be slightly different. Anyway, even with enough memory, the results of your benchmark will be incorrect (both for String and for StringBuilder).