
How can I improve performance, if append() is called on StringBuffer (or StringBuilder) consecutively without reusing the target variable

I have the following piece of code in Java.

String foo = " ";

Method 1:

StringBuffer buf = new StringBuffer();

Method 2:

StringBuffer buf = new StringBuffer();

Can someone explain how Method 2 can improve the performance of the code?
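For reference, a minimal sketch of the two styles being contrasted, assuming Method 1 issues each append() as its own statement and Method 2 chains the calls on the returned reference. The class name AppendStyles and the literals "Hello" and "World" are illustrative only.

```java
// Illustrative sketch; the class name and the literals are hypothetical.
class AppendStyles {
    static final String FOO = " ";

    // Method 1 style: every append() is a separate statement on buf.
    static String perStatement() {
        StringBuffer buf = new StringBuffer();
        buf.append("Hello");
        buf.append(FOO);
        buf.append("World");
        return buf.toString();
    }

    // Method 2 style: each append() is invoked on the value returned by the previous call.
    static String chained() {
        StringBuffer buf = new StringBuffer();
        buf.append("Hello").append(FOO).append("World");
        return buf.toString();
    }

    public static void main(String[] args) {
        // Both styles build the same string; only the call style differs.
        System.out.println(perStatement().equals(chained()));
    }
}
```

Both methods produce identical output, so any difference between them lies in the generated bytecode and how the JVM handles it.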


asked Jun 07 '16 07:06 by manoj


1 Answer

Is it really different?

Let's start by analyzing javac output. Given the code:

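public class Main {
  // All appends chained on the reference returned by the previous call.
  public String appendInline() {
    final StringBuilder sb = new StringBuilder().append("some").append(' ').append("string");
    return sb.toString();
  }

  // Each append is its own statement; the returned reference is discarded.
  public String appendPerLine() {
    final StringBuilder sb = new StringBuilder();
    sb.append("some");
    sb.append(' ');
    sb.append("string");
    return sb.toString();
  }
}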

We compile with javac and inspect the output with javap -c -s:

  public java.lang.String appendInline();
    descriptor: ()Ljava/lang/String;
       0: new           #2                  // class java/lang/StringBuilder
       3: dup
       4: invokespecial #3                  // Method java/lang/StringBuilder."<init>":()V
       7: ldc           #4                  // String some
       9: invokevirtual #5                  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
      12: bipush        32
      14: invokevirtual #6                  // Method java/lang/StringBuilder.append:(C)Ljava/lang/StringBuilder;
      17: ldc           #7                  // String string
      19: invokevirtual #5                  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
      22: astore_1
      23: aload_1
      24: invokevirtual #8                  // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
      27: areturn

  public java.lang.String appendPerLine();
    descriptor: ()Ljava/lang/String;
       0: new           #2                  // class java/lang/StringBuilder
       3: dup
       4: invokespecial #3                  // Method java/lang/StringBuilder."<init>":()V
       7: astore_1
       8: aload_1
       9: ldc           #4                  // String some
      11: invokevirtual #5                  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
      14: pop
      15: aload_1
      16: bipush        32
      18: invokevirtual #6                  // Method java/lang/StringBuilder.append:(C)Ljava/lang/StringBuilder;
      21: pop
      22: aload_1
      23: ldc           #7                  // String string
      25: invokevirtual #5                  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
      28: pop
      29: aload_1
      30: invokevirtual #8                  // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
      33: areturn

As seen, the appendPerLine variant produces larger bytecode (34 bytes versus 28). Each statement adds an aload_1, which pushes the string builder / buffer back onto the stack, and a pop, which discards the reference that append() returns; the two basically cancel each other out. In turn, this means the JVM ends up with a larger call site and greater overhead. Conversely, a smaller call site improves the chances that the JVM will inline the method calls, reducing method call overhead and further improving performance.

This alone improves the performance from a cold start when chaining method calls.
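Chaining is possible because append() returns the builder it was called on, which is exactly the reference the per-line variant throws away with pop. A small check of that behaviour, using the hypothetical class name AppendReturnsThis:

```java
// Illustrative sketch; the class name is hypothetical.
class AppendReturnsThis {
    // True when append() hands back the very object it was invoked on.
    static boolean returnsSameInstance() {
        StringBuilder sb = new StringBuilder();
        StringBuilder returned = sb.append("some");
        return returned == sb;
    }

    public static void main(String[] args) {
        System.out.println(returnsSameInstance()); // prints "true"
    }
}
```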

Shouldn't the JVM optimize this away?

One could argue that the JRE should be able to optimize these instructions away once the VM has warmed up. However, this claim needs support, and would still only apply to long-running processes.

So, let's check this claim, and validate the performance even after warmup. Let's use JMH to benchmark this behavior:

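import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Param;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;

@State(Scope.Thread) // the scope value is an assumption
public class StringBenchmark {
    private String from = "Alex";
    private String to = "Readers";
    private String subject = "Benchmarking with JMH";

    @Param({"16"})
    private int size;

    @Benchmark
    public String testEmailBuilderSimple() {
        StringBuilder builder = new StringBuilder(size);
        builder.append("From");
        builder.append(from);
        builder.append("To");
        builder.append(to);
        builder.append("Subject");
        builder.append(subject);
        return builder.toString();
    }

    @Benchmark
    public String testEmailBufferSimple() {
        StringBuffer buffer = new StringBuffer(size);
        buffer.append("From");
        buffer.append(from);
        buffer.append("To");
        buffer.append(to);
        buffer.append("Subject");
        buffer.append(subject);
        return buffer.toString();
    }

    @Benchmark
    public String testEmailBuilderChain() {
        return new StringBuilder(size).append("From").append(from).append("To").append(to).append("Subject")
                .append(subject).toString();
    }

    @Benchmark
    public String testEmailBufferChain() {
        return new StringBuffer(size).append("From").append(from).append("To").append(to).append("Subject")
                .append(subject).toString();
    }
}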

We compile and run it and we obtain:

Benchmark                               (size)   Mode  Cnt         Score        Error  Units
StringBenchmark.testEmailBufferChain        16  thrpt  200  22981842.957 ± 238502.907  ops/s
StringBenchmark.testEmailBufferSimple       16  thrpt  200   5789967.103 ±  62743.660  ops/s
StringBenchmark.testEmailBuilderChain       16  thrpt  200  22984472.260 ± 212243.175  ops/s
StringBenchmark.testEmailBuilderSimple      16  thrpt  200   5778824.788 ±  59200.312  ops/s

So, even after warming up, chaining the calls as the rule recommends yields roughly a 4x improvement in throughput (about 23.0 million versus 5.8 million ops/s). All these runs were done using Oracle JRE 8u121.
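The speedup can be worked out directly from the scores in the table. A minimal sketch of that arithmetic; the class and method names are hypothetical, and the numbers are copied from the results above:

```java
// Illustrative sketch; class and method names are hypothetical.
class ThroughputRatio {
    // Ratio of chained throughput to per-statement throughput (both in ops/s).
    static double ratio(double chained, double simple) {
        return chained / simple;
    }

    public static void main(String[] args) {
        // Scores taken from the benchmark results table.
        double buffer = ratio(22981842.957, 5789967.103);
        double builder = ratio(22984472.260, 5778824.788);
        System.out.println("StringBuffer chain/simple:  " + buffer);
        System.out.println("StringBuilder chain/simple: " + builder);
    }
}
```

Both ratios come out just under 4, and the reported error margins are around 1% of each score, so the gap is well outside the measurement noise.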

Of course, you don't have to take my word for it: others have done similar analyses, and you can even try it yourself.
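For a quick try without setting up JMH, a crude timing loop along these lines can give a first impression. This is only a sketch under loose assumptions: the class name RoughAppendTiming, the iteration count, and the fixed capacity of 16 are illustrative, and a hand-rolled System.nanoTime() loop is far noisier than JMH and may not reproduce the figures above.

```java
import java.util.function.Supplier;

// Illustrative sketch; not a substitute for a JMH benchmark.
class RoughAppendTiming {
    static final String FROM = "Alex";
    static final String TO = "Readers";
    static final String SUBJECT = "Benchmarking with JMH";

    // Per-statement style: each append() result is discarded.
    static String simple() {
        StringBuilder b = new StringBuilder(16);
        b.append("From");
        b.append(FROM);
        b.append("To");
        b.append(TO);
        b.append("Subject");
        b.append(SUBJECT);
        return b.toString();
    }

    // Chained style: each append() is called on the previous return value.
    static String chain() {
        return new StringBuilder(16).append("From").append(FROM).append("To").append(TO)
                .append("Subject").append(SUBJECT).toString();
    }

    // Elapsed nanoseconds for `iterations` calls of the task.
    static long time(Supplier<String> task, int iterations) {
        long sink = 0;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink += task.get().length(); // consume the result so the call is not dead code
        }
        long elapsed = System.nanoTime() - start;
        if (sink < 0) {
            throw new IllegalStateException("unreachable; keeps sink in use");
        }
        return elapsed;
    }

    public static void main(String[] args) {
        int n = 5_000_000;
        // Warm-up passes so the JIT has a chance to compile both methods.
        time(RoughAppendTiming::simple, n);
        time(RoughAppendTiming::chain, n);
        System.out.println("simple ns: " + time(RoughAppendTiming::simple, n));
        System.out.println("chain  ns: " + time(RoughAppendTiming::chain, n));
    }
}
```

Treat the output as a rough indication only; for numbers worth acting on, a proper JMH run with forks and warm-up iterations is the safer choice.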

Does it even matter?

Well, it depends. This is certainly a micro-optimization. If a system is using Bubble Sort, there are certainly more pressing performance issues than this. Not all programs have the same requirements and therefore not all need to follow the same rules.

This PMD rule is probably meaningful only to specific projects that value performance greatly and will do whatever it takes to shave off a couple of milliseconds. Such projects would normally use several different profilers, microbenchmarks, and other tools, and having a tool such as PMD keep an eye on specific patterns will certainly help them.

PMD has many other rules available that will probably apply to many other projects. Just because this particular rule may not apply to your project doesn't mean the tool is not useful. Take your time to review the available rules and choose those that really matter to you.

Hope that clears it up for everyone.

answered Nov 03 '22 00:11 by Johnco
