While investigating for a little debate w.r.t. using <code>"" + n</code> and <code>Integer.toString(int)</code> to convert an integer primitive to a string I wrote this JMH microbenchmark: <pre class="prettyprint"><code>@Fork(1) @OutputTimeUnit(TimeUnit.MILLISECONDS) @State(Scope.Benchmark) public class IntStr { protected int counter; @GenerateMicroBenchmark public String integerToString() { return Integer.toString(this.counter++); } @GenerateMicroBenchmark public String stringBuilder0() { return new StringBuilder().append(this.counter++).toString(); } @GenerateMicroBenchmark public String stringBuilder1() { return new StringBuilder().append("").append(this.counter++).toString(); } @GenerateMicroBenchmark public String stringBuilder2() { return new StringBuilder().append("").append(Integer.toString(this.counter++)).toString(); } @GenerateMicroBenchmark public String stringFormat() { return String.format("%d", this.counter++); } @Setup(Level.Iteration) public void prepareIteration() { this.counter = 0; } } </code></pre> I ran it with the default JMH options with both Java VMs that exist on my Linux machine (up-to-date Mageia 4 64-bit, Intel i7-3770 CPU, 32GB RAM). The first JVM was the one supplied with Oracle JDK 8u5 64-bit: <pre class="prettyprint"><code>java version "1.8.0_05" Java(TM) SE Runtime Environment (build 1.8.0_05-b13) Java HotSpot(TM) 64-Bit Server VM (build 25.5-b02, mixed mode) </code></pre> With this JVM I got pretty much what I expected: <pre class="prettyprint"><code>Benchmark Mode Samples Mean Mean error Units b.IntStr.integerToString thrpt 20 32317.048 698.703 ops/ms b.IntStr.stringBuilder0 thrpt 20 28129.499 421.520 ops/ms b.IntStr.stringBuilder1 thrpt 20 28106.692 1117.958 ops/ms b.IntStr.stringBuilder2 thrpt 20 20066.939 1052.937 ops/ms b.IntStr.stringFormat thrpt 20 2346.452 37.422 ops/ms </code></pre> I.e. using the <code>StringBuilder</code> class is slower due to the additional overhead of creating the <code>StringBuilder</code> object and appending an empty string. Using <code>String.format(String, ...)</code> is even slower, by an order of magnitude or so. The distribution-provided compiler, on the other hand, is based on OpenJDK 1.7: <pre class="prettyprint"><code>java version "1.7.0_55" OpenJDK Runtime Environment (mageia-2.4.7.1.mga4-x86_64 u55-b13) OpenJDK 64-Bit Server VM (build 24.51-b03, mixed mode) </code></pre> The results here were interesting: <pre class="prettyprint"><code>Benchmark Mode Samples Mean Mean error Units b.IntStr.integerToString thrpt 20 31249.306 881.125 ops/ms b.IntStr.stringBuilder0 thrpt 20 39486.857 663.766 ops/ms b.IntStr.stringBuilder1 thrpt 20 41072.058 484.353 ops/ms b.IntStr.stringBuilder2 thrpt 20 20513.913 466.130 ops/ms b.IntStr.stringFormat thrpt 20 2068.471 44.964 ops/ms </code></pre> Why does <code>StringBuilder.append(int)</code> appear so much faster with this JVM? Looking at the <code>StringBuilder</code> class source code revealed nothing particularly interesting - the method in question is almost identical to <code>Integer#toString(int)</code>. Interestingly enough, appending the result of <code>Integer.toString(int)</code> (the <code>stringBuilder2</code> microbenchmark) does not appear to be faster. Is this performance discrepancy an issue with the testing harness? Or does my OpenJDK JVM contain optimizations that would affect this particular code (anti)-pattern? EDIT: For a more straight-forward comparison, I installed Oracle JDK 1.7u55: <pre class="prettyprint"><code>java version "1.7.0_55" Java(TM) SE Runtime Environment (build 1.7.0_55-b13) Java HotSpot(TM) 64-Bit Server VM (build 24.55-b03, mixed mode) </code></pre> The results are similar to those of OpenJDK: <pre class="prettyprint"><code>Benchmark Mode Samples Mean Mean error Units b.IntStr.integerToString thrpt 20 32502.493 501.928 ops/ms b.IntStr.stringBuilder0 thrpt 20 39592.174 428.967 ops/ms b.IntStr.stringBuilder1 thrpt 20 40978.633 544.236 ops/ms </code></pre> It seems that this is a more general Java 7 vs Java 8 issue. Perhaps Java 7 had more aggressive string optimizations? EDIT 2: For completeness, here are the string-related VM options for both of these JVMs: For Oracle JDK 8u5: <pre class="prettyprint"><code>$ /usr/java/default/bin/java -XX:+PrintFlagsFinal 2>/dev/null | grep String bool OptimizeStringConcat = true {C2 product} intx PerfMaxStringConstLength = 1024 {product} bool PrintStringTableStatistics = false {product} uintx StringTableSize = 60013 {product} </code></pre> For OpenJDK 1.7: <pre class="prettyprint"><code>$ java -XX:+PrintFlagsFinal 2>/dev/null | grep String bool OptimizeStringConcat = true {C2 product} intx PerfMaxStringConstLength = 1024 {product} bool PrintStringTableStatistics = false {product} uintx StringTableSize = 60013 {product} bool UseStringCache = false {product} </code></pre> The <code>UseStringCache</code> option was removed in Java 8 with no replacement, so I doubt that makes any difference. The rest of the options appear to have the same settings. EDIT 3: A side-by-side comparison of the source code of the <code>AbstractStringBuilder</code>, <code>StringBuilder</code> and <code>Integer</code> classes from the <code>src.zip</code> file of reveals nothing noteworty. Apart from a whole lot of cosmetic and documentation changes, <code>Integer</code> now has some support for unsigned integers and <code>StringBuilder</code> has been slightly refactored to share more code with <code>StringBuffer</code>. None of these changes seem to affect the code paths used by <code>StringBuilder#append(int)</code>, although I may have missed something. A comparison of the assembly code generated for <code>IntStr#integerToString()</code> and <code>IntStr#stringBuilder0()</code> is far more interesting. The basic layout of the code generated for <code>IntStr#integerToString()</code> was similar for both JVMs, although Oracle JDK 8u5 seemed to be more aggressive w.r.t. inlining some calls within the <code>Integer#toString(int)</code> code. There was a clear correspondence with the Java source code, even for someone with minimal assembly experience. The assembly code for <code>IntStr#stringBuilder0()</code>, however, was radically different. The code generated by Oracle JDK 8u5 was once again directly related to the Java source code - I could easily recognise the same layout. On the contrary, the code generated by OpenJDK 7 was almost unrecognisable to the untrained eye (like mine). The <code>new StringBuilder()</code> call was seemingly removed, as was the creation of the array in the <code>StringBuilder</code> constructor. Additionaly, the disassembler plugin was not able to provide as many references to the source code as it did in JDK 8. I assume that this is either the result of a much more aggressive optimization pass in OpenJDK 7, or more probably the result of inserting hand-written low-level code for certain <code>StringBuilder</code> operations. I am unsure why this optimization does not happen in my JVM 8 implementation or why the same optimizations were not implemented for <code>Integer#toString(int)</code> in JVM 7. I guess someone familiar with the related parts of the JRE source code would have to answer these questions...

TL;DR: Side effects in <code>append</code> apparently break StringConcat optimizations. Very good analysis in the original question and updates! For completeness, below are a few missing steps: <ul> <li> See through the <code>-XX:+PrintInlining</code> for both 7u55 and 8u5. In 7u55, you will see something like this: <blockquote> <pre class="prettyprint"><code> @ 16 org.sample.IntStr::inlineSideEffect (25 bytes) force inline by CompilerOracle @ 4 java.lang.StringBuilder::<init> (7 bytes) inline (hot) @ 18 java.lang.StringBuilder::append (8 bytes) already compiled into a big method @ 21 java.lang.StringBuilder::toString (17 bytes) inline (hot) </code></pre> </blockquote> ...and in 8u5: <blockquote> <pre class="prettyprint"><code> @ 16 org.sample.IntStr::inlineSideEffect (25 bytes) force inline by CompilerOracle @ 4 java.lang.StringBuilder::<init> (7 bytes) inline (hot) @ 3 java.lang.AbstractStringBuilder::<init> (12 bytes) inline (hot) @ 1 java.lang.Object::<init> (1 bytes) inline (hot) @ 18 java.lang.StringBuilder::append (8 bytes) inline (hot) @ 2 java.lang.AbstractStringBuilder::append (62 bytes) already compiled into a big method @ 21 java.lang.StringBuilder::toString (17 bytes) inline (hot) @ 13 java.lang.String::<init> (62 bytes) inline (hot) @ 1 java.lang.Object::<init> (1 bytes) inline (hot) @ 55 java.util.Arrays::copyOfRange (63 bytes) inline (hot) @ 54 java.lang.Math::min (11 bytes) (intrinsic) @ 57 java.lang.System::arraycopy (0 bytes) (intrinsic) </code></pre> </blockquote> You might notice that 7u55 version is shallower, and it looks like nothing is called after <code>StringBuilder</code> methods -- this is a good indication the string optimizations are in effect. Indeed, if you run 7u55 with <code>-XX:-OptimizeStringConcat</code>, the subcalls will reappear, and performance drops to 8u5 levels. </li> <li>OK, so we need to figure out why 8u5 does not do the same optimization. Grep http://hg.openjdk.java.net/jdk9/jdk9/hotspot for "StringBuilder" to figure out where VM handles the StringConcat optimization; this will get you into <code>src/share/vm/opto/stringopts.cpp</code></li> <li> <code>hg log src/share/vm/opto/stringopts.cpp</code> to figure out the latest changes there. One of the candidates would be: <blockquote> <pre class="prettyprint"><code>changeset: 5493:90abdd727e64 user: iveresov date: Wed Oct 16 11:13:15 2013 -0700 summary: 8009303: Tiered: incorrect results in VM tests stringconcat... </code></pre> </blockquote> </li> <li>Look for the review threads on OpenJDK mailing lists (easy enough to google for changeset summary): http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2013-October/012084.html</li> <li>Spot "String concat optimization optimization collapses the pattern [...] into a single allocation of a string and forming the result directly. All possible deopts that may happen in the optimized code restart this pattern from the beginning (starting from the StringBuffer allocation). That means that the whole pattern must me side-effect free." Eureka?</li> <li> Write out the contrasting benchmark: <blockquote> <pre class="prettyprint"><code>@Fork(5) @Warmup(iterations = 5) @Measurement(iterations = 5) @BenchmarkMode(Mode.AverageTime) @OutputTimeUnit(TimeUnit.NANOSECONDS) @State(Scope.Benchmark) public class IntStr { private int counter; @GenerateMicroBenchmark public String inlineSideEffect() { return new StringBuilder().append(counter++).toString(); } @GenerateMicroBenchmark public String spliceSideEffect() { int cnt = counter++; return new StringBuilder().append(cnt).toString(); } } </code></pre> </blockquote> </li> <li> Measure it on JDK 7u55, seeing the same performance for inlined/spliced side effects: <blockquote> <pre class="prettyprint"><code>Benchmark Mode Samples Mean Mean error Units o.s.IntStr.inlineSideEffect avgt 25 65.460 1.747 ns/op o.s.IntStr.spliceSideEffect avgt 25 64.414 1.323 ns/op </code></pre> </blockquote> </li> <li> Measure it on JDK 8u5, seeing the performance degradation with the inlined effect: <blockquote> <pre class="prettyprint"><code>Benchmark Mode Samples Mean Mean error Units o.s.IntStr.inlineSideEffect avgt 25 84.953 2.274 ns/op o.s.IntStr.spliceSideEffect avgt 25 65.386 1.194 ns/op </code></pre> </blockquote> </li> <li>Submit the bug report (https://bugs.openjdk.java.net/browse/JDK-8043677) to discuss this behavior with VM guys. The rationale for original fix is rock solid, it is interesting however if we can/should get back this optimization in some trivial cases like these.</li> <li>???</li> <li>PROFIT.</li> </ul> And yeah, I should post the results for the benchmark which moves the increment from the <code>StringBuilder</code> chain, doing it before the entire chain. Also, switched to average time, and ns/op. This is JDK 7u55: <blockquote> <pre class="prettyprint"><code>Benchmark Mode Samples Mean Mean error Units o.s.IntStr.integerToString avgt 25 153.805 1.093 ns/op o.s.IntStr.stringBuilder0 avgt 25 128.284 6.797 ns/op o.s.IntStr.stringBuilder1 avgt 25 131.524 3.116 ns/op o.s.IntStr.stringBuilder2 avgt 25 254.384 9.204 ns/op o.s.IntStr.stringFormat avgt 25 2302.501 103.032 ns/op </code></pre> </blockquote> And this is 8u5: <blockquote> <pre class="prettyprint"><code>Benchmark Mode Samples Mean Mean error Units o.s.IntStr.integerToString avgt 25 153.032 3.295 ns/op o.s.IntStr.stringBuilder0 avgt 25 127.796 1.158 ns/op o.s.IntStr.stringBuilder1 avgt 25 131.585 1.137 ns/op o.s.IntStr.stringBuilder2 avgt 25 250.980 2.773 ns/op o.s.IntStr.stringFormat avgt 25 2123.706 25.105 ns/op </code></pre> </blockquote> <code>stringFormat</code> is actually a bit faster in 8u5, and all other tests are the same. This solidifies the hypothesis the side-effect breakage in SB chains in the major culprit in the original question.

I think this has to do with the <code>CompileThreshold</code> flag which controls when the byte code is compiled into machine code by JIT. The Oracle JDK has a default count of 10,000 as document at http://www.oracle.com/technetwork/java/javase/tech/vmoptions-jsp-140102.html. Where OpenJDK I couldn't find a latest document on this flag; but some mail threads suggest a much lower threshold: http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2010-November/004239.html Also, try turn on / off the Oracle JDK flags like <code>-XX:+UseCompressedStrings</code> and <code>-XX:+OptimizeStringConcat</code>. I am not sure if those flags are turned on by default on OpenJDK though. Could someone please suggest. One experiement you can do, is to firstly run the program by a lot of times, say, 30,000 loops, do a System.gc() and then try to look at the performance. I believe they would yield the same. And I assume your GC setting is the same too. Otherwise you are allocating a lot of objects and the GC might well be the major part of your run time.

Why is StringBuilder#append(int) faster in Java 7 than in Java 8?

Tags:

java

performance

java-8

java-7

jmh

While investigating for a little debate w.r.t. using "" + n and Integer.toString(int) to convert an integer primitive to a string I wrote this JMH microbenchmark:

@Fork(1)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
@State(Scope.Benchmark)
public class IntStr {
    protected int counter;


    @GenerateMicroBenchmark
    public String integerToString() {
        return Integer.toString(this.counter++);
    }

    @GenerateMicroBenchmark
    public String stringBuilder0() {
        return new StringBuilder().append(this.counter++).toString();
    }

    @GenerateMicroBenchmark
    public String stringBuilder1() {
        return new StringBuilder().append("").append(this.counter++).toString();
    }

    @GenerateMicroBenchmark
    public String stringBuilder2() {
        return new StringBuilder().append("").append(Integer.toString(this.counter++)).toString();
    }

    @GenerateMicroBenchmark
    public String stringFormat() {
        return String.format("%d", this.counter++);
    }

    @Setup(Level.Iteration)
    public void prepareIteration() {
        this.counter = 0;
    }
}

I ran it with the default JMH options with both Java VMs that exist on my Linux machine (up-to-date Mageia 4 64-bit, Intel i7-3770 CPU, 32GB RAM). The first JVM was the one supplied with Oracle JDK 8u5 64-bit:

java version "1.8.0_05"
Java(TM) SE Runtime Environment (build 1.8.0_05-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.5-b02, mixed mode)

With this JVM I got pretty much what I expected:

Benchmark                    Mode   Samples         Mean   Mean error    Units
b.IntStr.integerToString    thrpt        20    32317.048      698.703   ops/ms
b.IntStr.stringBuilder0     thrpt        20    28129.499      421.520   ops/ms
b.IntStr.stringBuilder1     thrpt        20    28106.692     1117.958   ops/ms
b.IntStr.stringBuilder2     thrpt        20    20066.939     1052.937   ops/ms
b.IntStr.stringFormat       thrpt        20     2346.452       37.422   ops/ms

I.e. using the StringBuilder class is slower due to the additional overhead of creating the StringBuilder object and appending an empty string. Using String.format(String, ...) is even slower, by an order of magnitude or so.

The distribution-provided compiler, on the other hand, is based on OpenJDK 1.7:

java version "1.7.0_55"
OpenJDK Runtime Environment (mageia-2.4.7.1.mga4-x86_64 u55-b13)
OpenJDK 64-Bit Server VM (build 24.51-b03, mixed mode)

The results here were interesting:

Benchmark                    Mode   Samples         Mean   Mean error    Units
b.IntStr.integerToString    thrpt        20    31249.306      881.125   ops/ms
b.IntStr.stringBuilder0     thrpt        20    39486.857      663.766   ops/ms
b.IntStr.stringBuilder1     thrpt        20    41072.058      484.353   ops/ms
b.IntStr.stringBuilder2     thrpt        20    20513.913      466.130   ops/ms
b.IntStr.stringFormat       thrpt        20     2068.471       44.964   ops/ms

Why does StringBuilder.append(int) appear so much faster with this JVM? Looking at the StringBuilder class source code revealed nothing particularly interesting - the method in question is almost identical to Integer#toString(int). Interestingly enough, appending the result of Integer.toString(int) (the stringBuilder2 microbenchmark) does not appear to be faster.

Is this performance discrepancy an issue with the testing harness? Or does my OpenJDK JVM contain optimizations that would affect this particular code (anti)-pattern?

EDIT:

For a more straight-forward comparison, I installed Oracle JDK 1.7u55:

java version "1.7.0_55"
Java(TM) SE Runtime Environment (build 1.7.0_55-b13)
Java HotSpot(TM) 64-Bit Server VM (build 24.55-b03, mixed mode)

The results are similar to those of OpenJDK:

Benchmark                    Mode   Samples         Mean   Mean error    Units
b.IntStr.integerToString    thrpt        20    32502.493      501.928   ops/ms
b.IntStr.stringBuilder0     thrpt        20    39592.174      428.967   ops/ms
b.IntStr.stringBuilder1     thrpt        20    40978.633      544.236   ops/ms

It seems that this is a more general Java 7 vs Java 8 issue. Perhaps Java 7 had more aggressive string optimizations?

EDIT 2:

For completeness, here are the string-related VM options for both of these JVMs:

For Oracle JDK 8u5:

$ /usr/java/default/bin/java -XX:+PrintFlagsFinal 2>/dev/null | grep String
     bool OptimizeStringConcat                      = true            {C2 product}
     intx PerfMaxStringConstLength                  = 1024            {product}
     bool PrintStringTableStatistics                = false           {product}
    uintx StringTableSize                           = 60013           {product}

For OpenJDK 1.7:

$ java -XX:+PrintFlagsFinal 2>/dev/null | grep String
     bool OptimizeStringConcat                      = true            {C2 product}        
     intx PerfMaxStringConstLength                  = 1024            {product}           
     bool PrintStringTableStatistics                = false           {product}           
    uintx StringTableSize                           = 60013           {product}           
     bool UseStringCache                            = false           {product}

The UseStringCache option was removed in Java 8 with no replacement, so I doubt that makes any difference. The rest of the options appear to have the same settings.

EDIT 3:

A side-by-side comparison of the source code of the AbstractStringBuilder, StringBuilder and Integer classes from the src.zip file of reveals nothing noteworty. Apart from a whole lot of cosmetic and documentation changes, Integer now has some support for unsigned integers and StringBuilder has been slightly refactored to share more code with StringBuffer. None of these changes seem to affect the code paths used by StringBuilder#append(int), although I may have missed something.

A comparison of the assembly code generated for IntStr#integerToString() and IntStr#stringBuilder0() is far more interesting. The basic layout of the code generated for IntStr#integerToString() was similar for both JVMs, although Oracle JDK 8u5 seemed to be more aggressive w.r.t. inlining some calls within the Integer#toString(int) code. There was a clear correspondence with the Java source code, even for someone with minimal assembly experience.

The assembly code for IntStr#stringBuilder0(), however, was radically different. The code generated by Oracle JDK 8u5 was once again directly related to the Java source code - I could easily recognise the same layout. On the contrary, the code generated by OpenJDK 7 was almost unrecognisable to the untrained eye (like mine). The new StringBuilder() call was seemingly removed, as was the creation of the array in the StringBuilder constructor. Additionaly, the disassembler plugin was not able to provide as many references to the source code as it did in JDK 8.

I assume that this is either the result of a much more aggressive optimization pass in OpenJDK 7, or more probably the result of inserting hand-written low-level code for certain StringBuilder operations. I am unsure why this optimization does not happen in my JVM 8 implementation or why the same optimizations were not implemented for Integer#toString(int) in JVM 7. I guess someone familiar with the related parts of the JRE source code would have to answer these questions...

853

asked May 20 '14 10:05

thkala

2 Answers

TL;DR: Side effects in append apparently break StringConcat optimizations.

Very good analysis in the original question and updates!

For completeness, below are a few missing steps:

See through the -XX:+PrintInlining for both 7u55 and 8u5. In 7u55, you will see something like this:

 @ 16   org.sample.IntStr::inlineSideEffect (25 bytes)   force inline by CompilerOracle
   @ 4   java.lang.StringBuilder::<init> (7 bytes)   inline (hot)
   @ 18   java.lang.StringBuilder::append (8 bytes)   already compiled into a big method
   @ 21   java.lang.StringBuilder::toString (17 bytes)   inline (hot)

...and in 8u5:

 @ 16   org.sample.IntStr::inlineSideEffect (25 bytes)   force inline by CompilerOracle
   @ 4   java.lang.StringBuilder::<init> (7 bytes)   inline (hot)
     @ 3   java.lang.AbstractStringBuilder::<init> (12 bytes)   inline (hot)
       @ 1   java.lang.Object::<init> (1 bytes)   inline (hot)
   @ 18   java.lang.StringBuilder::append (8 bytes)   inline (hot)
     @ 2   java.lang.AbstractStringBuilder::append (62 bytes)   already compiled into a big method
   @ 21   java.lang.StringBuilder::toString (17 bytes)   inline (hot)
     @ 13   java.lang.String::<init> (62 bytes)   inline (hot)
       @ 1   java.lang.Object::<init> (1 bytes)   inline (hot)
       @ 55   java.util.Arrays::copyOfRange (63 bytes)   inline (hot)
         @ 54   java.lang.Math::min (11 bytes)   (intrinsic)
         @ 57   java.lang.System::arraycopy (0 bytes)   (intrinsic)

You might notice that 7u55 version is shallower, and it looks like nothing is called after StringBuilder methods -- this is a good indication the string optimizations are in effect. Indeed, if you run 7u55 with -XX:-OptimizeStringConcat, the subcalls will reappear, and performance drops to 8u5 levels.

OK, so we need to figure out why 8u5 does not do the same optimization. Grep http://hg.openjdk.java.net/jdk9/jdk9/hotspot for "StringBuilder" to figure out where VM handles the StringConcat optimization; this will get you into src/share/vm/opto/stringopts.cpp

hg log src/share/vm/opto/stringopts.cpp to figure out the latest changes there. One of the candidates would be:

changeset:   5493:90abdd727e64
user:        iveresov
date:        Wed Oct 16 11:13:15 2013 -0700
summary:     8009303: Tiered: incorrect results in VM tests stringconcat...

Look for the review threads on OpenJDK mailing lists (easy enough to google for changeset summary): http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2013-October/012084.html
Spot "String concat optimization optimization collapses the pattern [...] into a single allocation of a string and forming the result directly. All possible deopts that may happen in the optimized code restart this pattern from the beginning (starting from the StringBuffer allocation). That means that the whole pattern must me side-effect free." Eureka?

Write out the contrasting benchmark:

@Fork(5)
@Warmup(iterations = 5)
@Measurement(iterations = 5)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@State(Scope.Benchmark)
public class IntStr {
    private int counter;

    @GenerateMicroBenchmark
    public String inlineSideEffect() {
        return new StringBuilder().append(counter++).toString();
    }

    @GenerateMicroBenchmark
    public String spliceSideEffect() {
        int cnt = counter++;
        return new StringBuilder().append(cnt).toString();
    }
}

Measure it on JDK 7u55, seeing the same performance for inlined/spliced side effects:

Benchmark                       Mode   Samples         Mean   Mean error    Units
o.s.IntStr.inlineSideEffect     avgt        25       65.460        1.747    ns/op
o.s.IntStr.spliceSideEffect     avgt        25       64.414        1.323    ns/op

Measure it on JDK 8u5, seeing the performance degradation with the inlined effect:

Benchmark                       Mode   Samples         Mean   Mean error    Units
o.s.IntStr.inlineSideEffect     avgt        25       84.953        2.274    ns/op
o.s.IntStr.spliceSideEffect     avgt        25       65.386        1.194    ns/op

Submit the bug report (https://bugs.openjdk.java.net/browse/JDK-8043677) to discuss this behavior with VM guys. The rationale for original fix is rock solid, it is interesting however if we can/should get back this optimization in some trivial cases like these.
???
PROFIT.

And yeah, I should post the results for the benchmark which moves the increment from the StringBuilder chain, doing it before the entire chain. Also, switched to average time, and ns/op. This is JDK 7u55:

Benchmark                      Mode   Samples         Mean   Mean error    Units
o.s.IntStr.integerToString     avgt        25      153.805        1.093    ns/op
o.s.IntStr.stringBuilder0      avgt        25      128.284        6.797    ns/op
o.s.IntStr.stringBuilder1      avgt        25      131.524        3.116    ns/op
o.s.IntStr.stringBuilder2      avgt        25      254.384        9.204    ns/op
o.s.IntStr.stringFormat        avgt        25     2302.501      103.032    ns/op

And this is 8u5:

Benchmark                      Mode   Samples         Mean   Mean error    Units
o.s.IntStr.integerToString     avgt        25      153.032        3.295    ns/op
o.s.IntStr.stringBuilder0      avgt        25      127.796        1.158    ns/op
o.s.IntStr.stringBuilder1      avgt        25      131.585        1.137    ns/op
o.s.IntStr.stringBuilder2      avgt        25      250.980        2.773    ns/op
o.s.IntStr.stringFormat        avgt        25     2123.706       25.105    ns/op

stringFormat is actually a bit faster in 8u5, and all other tests are the same. This solidifies the hypothesis the side-effect breakage in SB chains in the major culprit in the original question.

149

answered Oct 20 '22 15:10

Aleksey Shipilev

I think this has to do with the CompileThreshold flag which controls when the byte code is compiled into machine code by JIT.

The Oracle JDK has a default count of 10,000 as document at http://www.oracle.com/technetwork/java/javase/tech/vmoptions-jsp-140102.html.

Where OpenJDK I couldn't find a latest document on this flag; but some mail threads suggest a much lower threshold: http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2010-November/004239.html

Also, try turn on / off the Oracle JDK flags like -XX:+UseCompressedStrings and -XX:+OptimizeStringConcat. I am not sure if those flags are turned on by default on OpenJDK though. Could someone please suggest.

One experiement you can do, is to firstly run the program by a lot of times, say, 30,000 loops, do a System.gc() and then try to look at the performance. I believe they would yield the same.

And I assume your GC setting is the same too. Otherwise you are allocating a lot of objects and the GC might well be the major part of your run time.

answered Oct 20 '22 15:10

Alex Suo

Related questions
                            
                                Android Activity without ActionBar
                            
                                Google Maps V2 - Android - Get the current zoom level
                            
                                The activity must be exported or contain an intent-filter
                            
                                Elegant way to read file into byte[] array in Java [duplicate]
                            
                                Initializing ArrayList with some predefined values [duplicate]
                            
                                Most elegant way to change 0 to 1 and vice versa
                            
                                how to zip a folder itself using java
                            
                                Limiting the number of characters in a string, and chopping off the rest
                            
                                How can a Java variable be different from itself? [duplicate]
                            
                                What's the intended use of IllegalStateException?
                            
                                Difference between extracting and packaging libraries into a jar file
                            
                                Java 8 CompletableFuture.allOf(...) with Collection or List [duplicate]
                            
                                What is a real life example of generic <? super T>?
                            
                                Which JDK version (Language Level) is required for Android Studio?
                            
                                Any way to generate ant build.xml file automatically from Eclipse?
                            
                                How to step one step back in IntelliJ?
                            
                                Getting mail from GMail into Java application using IMAP
                            
                                How do I use 3DES encryption/decryption in Java?
                            
                                What Java XML library do you recommend (to replace dom4j)? [closed]
                            
                                Intellij - can be replaced with method reference

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With