Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

String concatenation in a for loop. Java 9

Please correct me if i'm wrong. In Java 8, for performance reasons, when concatenating several strings by the "+" operator StringBuffer was invoked. And the problem of creating a bunch of intermediate string objects and polluting the string pool was "resolved".

What about Java 9? There'a new feature added as Invokedynamic. And a new class that resolves the problem even better, StringConcatFactory.

String result = "";
List<String> list = Arrays.asList("a", "b", "c");
for (String n : list) {
 result+=n;
}

My question are: How many objects are created in this loop? Are there any intermedier objects? And how can i verify that?

like image 234
D2k Avatar asked Mar 23 '18 21:03

D2k


People also ask

How do you concatenate a string in a for loop in Java?

str1 = str2 + str2; Since Strings are immutable in java, instead of modifying str1 a new (intermediate) String object is created with the concatenated value and it is assigned to the reference str1. If you concatenate Stings in loops for each iteration a new intermediate object is created in the String constant pool.

How do you concatenate a string in a for loop?

To concatenate to a string in a for loop: Declare a variable and initialize it to an empty string. Use a for loop to iterate over the sequence. Reassign the variable to its current value plus the current item.

Is string concatenation in loop bad?

Additionally, String concatenation using the + operator within a loop should be avoided. Since the String object is immutable, each call for concatenation will result in a new String object being created.

Can you use += for string concatenation?

Concatenation is the process of combining two or more strings to form a new string by subsequently appending the next string to the end of the previous strings. In Java, two strings can be concatenated by using the + or += operator, or through the concat() method, defined in the java. lang. String class.


2 Answers

My question is: How many objects are created in this loop? Are there any intermediate objects? How can I verify that?

Spoiler:

JVM doesn't try to omit intermediate objects in the loop - so they will be created when using plain concatenation.

Let's take a look at the bytecode first. I used performance tests kindly provided by @Eugene, compiled them for java8 and then for java9. Here are those 2 methods we gonna compare:

public String concatBuilder() {
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < howmany; ++i) {
        sb.append(i);
    }
    return sb.toString();
}

public String concatPlain() {
    String result = "";
    for (int i = 0; i < howmany; ++i) {
        result = result + i;
    }
    return result;
}

My java versions are the following:

java version "1.8.0_131"
Java(TM) SE Runtime Environment (build 1.8.0_131-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.131-b11, mixed mode)

java version "9.0.4"
Java(TM) SE Runtime Environment (build 9.0.4+11)
Java HotSpot(TM) 64-Bit Server VM (build 9.0.4+11, mixed mode)

The JMH version is 1.20

Here is the output I get from javap -c LoopTest.class:

Method concatBuilder() that utilises StringBuilder explicitly looks exactly the same for java8 and java9:

public java.lang.String concatBuilder();
Code:
   0: new           #17                 // class java/lang/StringBuilder
   3: dup
   4: invokespecial #18                 // Method java/lang/StringBuilder."<init>":()V
   7: astore_1
   8: iconst_0
   9: istore_2
  10: iload_2
  11: aload_0
  12: getfield      #19                 // Field howmany:I
  15: if_icmpge     30
  18: aload_1
  19: iload_2
  20: invokevirtual #20                 // Method java/lang/StringBuilder.append:(I)Ljava/lang/StringBuilder;
  23: pop
  24: iinc          2, 1
  27: goto          10
  30: aload_1
  31: invokevirtual #21                 // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
  34: areturn

Note that the invocation of StringBuilder.append happens inside the loop, while StringBuilder.toString is called outside of it. This is important - it means that there will be no intermediate objects created. In java8 bytecode it's a bit different:

Method concatPlain() in Java8:

public java.lang.String concatPlain();
Code:
   0: ldc           #22                 // String
   2: astore_1
   3: iconst_0
   4: istore_2
   5: iload_2
   6: aload_0
   7: getfield      #19                 // Field howmany:I
  10: if_icmpge     38
  13: new           #17                 // class java/lang/StringBuilder
  16: dup
  17: invokespecial #18                 // Method java/lang/StringBuilder."<init>":()V
  20: aload_1
  21: invokevirtual #23                 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
  24: iload_2
  25: invokevirtual #20                 // Method java/lang/StringBuilder.append:(I)Ljava/lang/StringBuilder;
  28: invokevirtual #21                 // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
  31: astore_1
  32: iinc          2, 1
  35: goto          5
  38: aload_1
  39: areturn

You can see that in java8 both StringBuilder.append and StringBuilder.toString are called inside the loop statement which means that it doesn't even try to omit creation of intermediate objects! It can be described in the code below:

public String concatPlain() {
    String result = "";
    for (int i = 0; i < howmany; ++i) {
        result = result + i;
        result = new StringBuilder().append(result).append(i).toString();
    }
    return result;
}

This explains performance difference between concatPlain() and concatBuilder() (which is few thousand times(!)). The same issue happening with java9 - it doesn't try to avoid intermediate objects inside a loop, but it does a slightly better job inside a loop than java8 does (performance results are added):

Method concatPlain() Java9:

public java.lang.String concatPlain();
Code:
   0: ldc           #22                 // String
   2: astore_1
   3: iconst_0
   4: istore_2
   5: iload_2
   6: aload_0
   7: getfield      #19                 // Field howmany:I
  10: if_icmpge     27
  13: aload_1
  14: iload_2
  15: invokedynamic #23,  0             // InvokeDynamic #0:makeConcatWithConstants:(Ljava/lang/String;I)Ljava/lang/String;
  20: astore_1
  21: iinc          2, 1
  24: goto          5
  27: aload_1
  28: areturn

Here are performance results:

JAVA 8:

# Run complete. Total time: 00:02:18

Benchmark               (howmany)  Mode  Cnt     Score      Error  Units
LoopTest.concatBuilder     100000  avgt    5     2.098 ±    0.027  ms/op
LoopTest.concatPlain       100000  avgt    5  6908.737 ± 1227.681  ms/op

JAVA 9:

For java 9 there are different strategies defined with -Djava.lang.invoke.stringConcat. I tried all of them:

Default (MH_INLINE_SIZED_EXACT):

# Run complete. Total time: 00:02:30
Benchmark               (howmany)  Mode  Cnt     Score    Error  Units
LoopTest.concatBuilder     100000  avgt    5     1.625 ±  0.015  ms/op
LoopTest.concatPlain       100000  avgt    5  4812.022 ± 73.453  ms/op

-Djava.lang.invoke.stringConcat=BC_SB

# Run complete. Total time: 00:02:28
Benchmark               (howmany)  Mode  Cnt     Score    Error  Units
LoopTest.concatBuilder     100000  avgt    5     1.501 ±  0.024  ms/op
LoopTest.concatPlain       100000  avgt    5  4803.543 ± 53.825  ms/op

-Djava.lang.invoke.stringConcat=BC_SB_SIZED

# Run complete. Total time: 00:02:17
Benchmark               (howmany)  Mode  Cnt     Score     Error  Units
LoopTest.concatBuilder     100000  avgt    5     1.546 ±   0.027  ms/op
LoopTest.concatPlain       100000  avgt    5  4941.226 ± 422.704  ms/op

-Djava.lang.invoke.stringConcat=BC_SB_SIZED_EXACT

# Run complete. Total time: 00:02:45
Benchmark               (howmany)  Mode  Cnt      Score     Error  Units
LoopTest.concatBuilder     100000  avgt    5      1.560 ±   0.073  ms/op
LoopTest.concatPlain       100000  avgt    5  11390.665 ± 232.269  ms/op

-Djava.lang.invoke.stringConcat=BC_SB_SIZED_EXACT

# Run complete. Total time: 00:02:16
Benchmark               (howmany)  Mode  Cnt     Score     Error  Units
LoopTest.concatBuilder     100000  avgt    5     1.616 ±   0.030  ms/op
LoopTest.concatPlain       100000  avgt    5  8524.200 ± 219.499  ms/op

-Djava.lang.invoke.stringConcat=MH_SB_SIZED_EXACT

# Run complete. Total time: 00:02:17
Benchmark               (howmany)  Mode  Cnt     Score     Error  Units
LoopTest.concatBuilder     100000  avgt    5     1.633 ±   0.058  ms/op
LoopTest.concatPlain       100000  avgt    5  8499.228 ± 972.832  ms/op

-Djava.lang.invoke.stringConcat=MH_INLINE_SIZED_EXACT (yes, it's the default one but I decided to set it explicitly for clarity of experiment)

# Run complete. Total time: 00:02:23
Benchmark               (howmany)  Mode  Cnt     Score    Error  Units
LoopTest.concatBuilder     100000  avgt    5     1.654 ±  0.015  ms/op
LoopTest.concatPlain       100000  avgt    5  4812.231 ± 54.061  ms/op

I decided to investigate memory usage but didn't find anything interesting except that java9 consumes more memory. Attached screenshots in case anybody would be interested. Of course, they were made after the actual performance measurements, but not during them.

Java8 concatBuilder(): Java8 concatBuilder() Java8 concatPlain(): enter image description here Java9 concatBuilder(): enter image description here Java9 concatPlain(): enter image description here

So yeah, answering your question I can say that neither java8 nor java9 can avoid creating intermediate objects inside a loop.

UPDATE:

As pointed out by @Eugene naked bytecode migt be meaningless since JIT does a lot of optimizations in runtime which looks logical to me, so I decided to add the output of optimized by JIT code (captured by -XX:CompileCommand=print,*LoopTest.concatPlain).

JAVA 8:

0x00007f8c2d216d29: callq   0x7f8c2d0fdea0    ; OopMap{rsi=Oop [96]=Oop off=1550}
                                            ;*synchronization entry
                                            ; - org.sample.LoopTest::concatPlain@-1 (line 73)
                                            ;   {runtime_call}
0x00007f8c2d216d2e: jmpq    0x7f8c2d216786
0x00007f8c2d216d33: mov     %rdx,%rdx
0x00007f8c2d216d36: callq   0x7f8c2d0fa1a0    ; OopMap{r9=Oop [96]=Oop off=1563}
                                            ;*new  ; - org.sample.LoopTest::concatPlain@13 (line 75)
                                            ;   {runtime_call}
0x00007f8c2d216d3b: jmpq    0x7f8c2d2167e6
0x00007f8c2d216d40: mov     %rbx,0x8(%rsp)
0x00007f8c2d216d45: movq    $0xffffffffffffffff,(%rsp)
0x00007f8c2d216d4d: callq   0x7f8c2d0fdea0    ; OopMap{r9=Oop [96]=Oop rax=Oop off=1586}
                                            ;*synchronization entry
                                            ; - java.lang.StringBuilder::<init>@-1 (line 89)
                                            ; - org.sample.LoopTest::concatPlain@17 (line 75)
                                            ;   {runtime_call}
0x00007f8c2d216d52: jmpq    0x7f8c2d21682d
0x00007f8c2d216d57: mov     %rbx,0x8(%rsp)
0x00007f8c2d216d5c: movq    $0xffffffffffffffff,(%rsp)
0x00007f8c2d216d64: callq   0x7f8c2d0fdea0    ; OopMap{r9=Oop [96]=Oop rax=Oop off=1609}
                                            ;*synchronization entry
                                            ; - java.lang.AbstractStringBuilder::<init>@-1 (line 67)
                                            ; - java.lang.StringBuilder::<init>@3 (line 89)
                                            ; - org.sample.LoopTest::concatPlain@17 (line 75)
                                            ;   {runtime_call}
0x00007f8c2d216d69: jmpq    0x7f8c2d216874
0x00007f8c2d216d6e: mov     %rbx,0x8(%rsp)
0x00007f8c2d216d73: movq    $0xffffffffffffffff,(%rsp)
0x00007f8c2d216d7b: callq   0x7f8c2d0fdea0    ; OopMap{r9=Oop [96]=Oop rax=Oop off=1632}
                                            ;*synchronization entry
                                            ; - java.lang.Object::<init>@-1 (line 37)
                                            ; - java.lang.AbstractStringBuilder::<init>@1 (line 67)
                                            ; - java.lang.StringBuilder::<init>@3 (line 89)
                                            ; - org.sample.LoopTest::concatPlain@17 (line 75)
                                            ;   {runtime_call}
0x00007f8c2d216d80: jmpq    0x7f8c2d2168bb
0x00007f8c2d216d85: callq   0x7f8c2d0faa60    ; OopMap{r9=Oop [96]=Oop r13=Oop off=1642}
                                            ;*newarray
                                            ; - java.lang.AbstractStringBuilder::<init>@6 (line 68)
                                            ; - java.lang.StringBuilder::<init>@3 (line 89)
                                            ; - org.sample.LoopTest::concatPlain@17 (line 75)
                                            ;   {runtime_call}
0x00007f8c2d216d8a: jmpq    0x7f8c2d21693a
0x00007f8c2d216d8f: mov     %rdx,0x8(%rsp)
0x00007f8c2d216d94: movq    $0xffffffffffffffff,(%rsp)
0x00007f8c2d216d9c: callq   0x7f8c2d0fdea0    ; OopMap{r9=Oop [96]=Oop r13=Oop off=1665}
                                            ;*synchronization entry
                                            ; - java.lang.StringBuilder::append@-1 (line 136)
                                            ; - org.sample.LoopTest::concatPlain@21 (line 75)
                                            ;   {runtime_call}
0x00007f8c2d216da1: jmpq    0x7f8c2d216a1c
0x00007f8c2d216da6: mov     %rdx,0x8(%rsp)
0x00007f8c2d216dab: movq    $0xffffffffffffffff,(%rsp)
0x00007f8c2d216db3: callq   0x7f8c2d0fdea0    ; OopMap{[80]=Oop [96]=Oop off=1688}
                                            ;*synchronization entry
                                            ; - java.lang.StringBuilder::append@-1 (line 208)
                                            ; - org.sample.LoopTest::concatPlain@25 (line 75)
                                            ;   {runtime_call}
0x00007f8c2d216db8: jmpq    0x7f8c2d216b08
0x00007f8c2d216dbd: mov     %rdx,0x8(%rsp)
0x00007f8c2d216dc2: movq    $0xffffffffffffffff,(%rsp)
0x00007f8c2d216dca: callq   0x7f8c2d0fdea0    ; OopMap{[80]=Oop [96]=Oop off=1711}
                                            ;*synchronization entry
                                            ; - java.lang.StringBuilder::toString@-1 (line 407)
                                            ; - org.sample.LoopTest::concatPlain@28 (line 75)
                                            ;   {runtime_call}
0x00007f8c2d216dcf: jmpq    0x7f8c2d216bf8
0x00007f8c2d216dd4: mov     %rdx,%rdx
0x00007f8c2d216dd7: callq   0x7f8c2d0fa1a0    ; OopMap{[80]=Oop [96]=Oop off=1724}
                                            ;*new  ; - java.lang.StringBuilder::toString@0 (line 407)
                                            ; - org.sample.LoopTest::concatPlain@28 (line 75)
                                            ;   {runtime_call}
0x00007f8c2d216ddc: jmpq    0x7f8c2d216c39
0x00007f8c2d216de1: mov     %rax,0x8(%rsp)
0x00007f8c2d216de6: movq    $0x23,(%rsp)
0x00007f8c2d216dee: callq   0x7f8c2d0fdea0    ; OopMap{[96]=Oop [104]=Oop off=1747}
                                            ;*goto
                                            ; - org.sample.LoopTest::concatPlain@35 (line 74)
                                            ;   {runtime_call}
0x00007f8c2d216df3: jmpq    0x7f8c2d216cae

As you can see StringBuilder::toString is invoked before the goto which means that everything is happening inside the loop. Similar situation with java9 - StringConcatHelper::newString is invoked before the goto command.

JAVA 9:

0x00007fa1256548a4: mov     %ebx,%r13d
0x00007fa1256548a7: sub     0xc(%rsp),%r13d   ;*isub {reexecute=0 rethrow=0 return_oop=0}
                                            ; - java.lang.StringConcatHelper::prepend@5 (line 329)
                                            ; - java.lang.invoke.DirectMethodHandle$Holder::invokeStatic@16
                                            ; - java.lang.invoke.LambdaForm$BMH/127835623::reinvoke@172
                                            ; - java.lang.invoke.LambdaForm$MH/1587176117::linkToTargetMethod@6
                                            ; - org.sample.LoopTest::concatPlain@15 (line 75)

0x00007fa1256548ac: test    %r13d,%r13d
0x00007fa1256548af: jl      0x7fa125654b11
0x00007fa1256548b5: mov     %r13d,%r10d
0x00007fa1256548b8: add     %r9d,%r10d
0x00007fa1256548bb: mov     0x20(%rsp),%r11d
0x00007fa1256548c0: cmp     %r10d,%r11d
0x00007fa1256548c3: jb      0x7fa125654b11    ;*invokestatic arraycopy {reexecute=0 rethrow=0 return_oop=0}
                                            ; - java.lang.String::getBytes@22 (line 2993)
                                            ; - java.lang.StringConcatHelper::prepend@11 (line 330)
                                            ; - java.lang.invoke.DirectMethodHandle$Holder::invokeStatic@16
                                            ; - java.lang.invoke.LambdaForm$BMH/127835623::reinvoke@172
                                            ; - java.lang.invoke.LambdaForm$MH/1587176117::linkToTargetMethod@6
                                            ; - org.sample.LoopTest::concatPlain@15 (line 75)

0x00007fa1256548c9: test    %r9d,%r9d
0x00007fa1256548cc: jbe     0x7fa1256548ef
0x00007fa1256548ce: movsxd  %r9d,%rdx
0x00007fa1256548d1: lea     (%r12,%r8,8),%r10  ;*getfield value {reexecute=0 rethrow=0 return_oop=0}
                                            ; - java.lang.String::length@1 (line 669)
                                            ; - java.lang.StringConcatHelper::mixLen@2 (line 116)
                                            ; - java.lang.invoke.DirectMethodHandle$Holder::invokeStatic@11
                                            ; - java.lang.invoke.LambdaForm$BMH/127835623::reinvoke@105
                                            ; - java.lang.invoke.LambdaForm$MH/1587176117::linkToTargetMethod@6
                                            ; - org.sample.LoopTest::concatPlain@15 (line 75)

0x00007fa1256548d5: lea     0x10(%r12,%r8,8),%rdi
0x00007fa1256548da: mov     %rcx,%r10
0x00007fa1256548dd: lea     0x10(%rcx,%r13),%rsi
0x00007fa1256548e2: movabs  $0x7fa11db9d640,%r10
0x00007fa1256548ec: callq   %r10              ;*invokestatic arraycopy {reexecute=0 rethrow=0 return_oop=0}
                                            ; - java.lang.String::getBytes@22 (line 2993)
                                            ; - java.lang.StringConcatHelper::prepend@11 (line 330)
                                            ; - java.lang.invoke.DirectMethodHandle$Holder::invokeStatic@16
                                            ; - java.lang.invoke.LambdaForm$BMH/127835623::reinvoke@172
                                            ; - java.lang.invoke.LambdaForm$MH/1587176117::linkToTargetMethod@6
                                            ; - org.sample.LoopTest::concatPlain@15 (line 75)

0x00007fa1256548ef: cmp     0xc(%rsp),%ebx
0x00007fa1256548f3: jne     0x7fa125654cb9    ;*ifeq {reexecute=0 rethrow=0 return_oop=0}
                                            ; - java.lang.StringConcatHelper::newString@1 (line 343)
                                            ; - java.lang.invoke.DirectMethodHandle$Holder::invokeStatic@14
                                            ; - java.lang.invoke.LambdaForm$BMH/127835623::reinvoke@194
                                            ; - java.lang.invoke.LambdaForm$MH/1587176117::linkToTargetMethod@6
                                            ; - org.sample.LoopTest::concatPlain@15 (line 75)

0x00007fa1256548f9: mov     0x60(%r15),%rax
0x00007fa1256548fd: mov     %rax,%r10
0x00007fa125654900: add     $0x18,%r10
0x00007fa125654904: cmp     0x70(%r15),%r10
0x00007fa125654908: jnb     0x7fa125654aa5
0x00007fa12565490e: mov     %r10,0x60(%r15)
0x00007fa125654912: prefetchnta 0x100(%r10)
0x00007fa12565491a: mov     0x18(%rsp),%rsi
0x00007fa12565491f: mov     0xb0(%rsi),%r10
0x00007fa125654926: mov     %r10,(%rax)
0x00007fa125654929: movl    $0xf80002da,0x8(%rax)  ;   {metadata('java/lang/String')}
0x00007fa125654930: mov     %r12d,0xc(%rax)
0x00007fa125654934: mov     %r12,0x10(%rax)   ;*new {reexecute=0 rethrow=0 return_oop=0}
                                            ; - java.lang.StringConcatHelper::newString@36 (line 346)
                                            ; - java.lang.invoke.DirectMethodHandle$Holder::invokeStatic@14
                                            ; - java.lang.invoke.LambdaForm$BMH/127835623::reinvoke@194
                                            ; - java.lang.invoke.LambdaForm$MH/1587176117::linkToTargetMethod@6
                                            ; - org.sample.LoopTest::concatPlain@15 (line 75)

0x00007fa125654938: mov     0x30(%rsp),%r10
0x00007fa12565493d: shr     $0x3,%r10
0x00007fa125654941: mov     %r10d,0xc(%rax)   ;*synchronization entry
                                            ; - java.lang.StringConcatHelper::newString@-1 (line 343)
                                            ; - java.lang.invoke.DirectMethodHandle$Holder::invokeStatic@14
                                            ; - java.lang.invoke.LambdaForm$BMH/127835623::reinvoke@194
                                            ; - java.lang.invoke.LambdaForm$MH/1587176117::linkToTargetMethod@6
                                            ; - org.sample.LoopTest::concatPlain@15 (line 75)

0x00007fa125654945: mov     0x8(%rsp),%ebx
0x00007fa125654949: incl    %ebx              ; ImmutableOopMap{rax=Oop [0]=Oop }
                                            ;*goto {reexecute=1 rethrow=0 return_oop=0}
                                            ; - org.sample.LoopTest::concatPlain@24 (line 74)

0x00007fa12565494b: test    %eax,0x1a8996af(%rip)  ;*goto {reexecute=0 rethrow=0 return_oop=0}
                                            ; - org.sample.LoopTest::concatPlain@24 (line 74)
                                            ;   {poll}
like image 69
Danylo Zatorsky Avatar answered Oct 01 '22 05:10

Danylo Zatorsky


For the record, here is a JMH test...

@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
@Warmup(iterations = 5, time = 5, timeUnit = TimeUnit.SECONDS)
@Measurement(iterations = 5, time = 5, timeUnit = TimeUnit.SECONDS)
@State(Scope.Thread)
public class LoopTest {

    public static void main(String[] args) throws RunnerException {
        Options opt = new OptionsBuilder().include(LoopTest.class.getSimpleName())
                .jvmArgs("-ea", "-Xms10000m", "-Xmx10000m")
                .shouldFailOnError(true)
                .build();
        new Runner(opt).run();
    }

    @Param(value = {"1000", "10000", "100000"})
    int howmany;

    @Fork(1)
    @Benchmark
    public String concatBuilder(){
        StringBuilder sb = new StringBuilder();
        for(int i=0;i<howmany;++i){
            sb.append(i);
        }
        return sb.toString();
    }

    @Fork(1)
    @Benchmark
    public String concatPlain(){
        String result = "";
        for(int i=0;i<howmany;++i){
            result +=i;
        }
        return result;
    }
}

Produces result (only for 100000 shown here) that I did not really expect:

LoopTest.concatPlain       100000  avgt    5  3902.711 ± 67.215  ms/op
LoopTest.concatBuilder     100000  avgt    5     1.850 ±  0.574  ms/op
like image 43
Eugene Avatar answered Oct 01 '22 06:10

Eugene