
Why do StringBuilders pop up when debugging String concatenation?

I am aware that Strings are immutable, and I know when to use a StringBuilder or a StringBuffer. I have also read that the bytecode for these two snippets would end up being the same:

// Snippet 1
String variable = "text";
getClass().getResourceAsStream("string" + variable);

// Snippet 2
StringBuilder sb = new StringBuilder("string");
sb.append("text");
getClass().getResourceAsStream(sb.toString());

But I obviously have something wrong. When debugging through Snippet 1 in Eclipse, I am actually taken to the StringBuilder constructor and to the append method. I suppose I'm missing details on how bytecode is interpreted and how the debugger maps it back to lines in the source code; if anyone could explain this a bit, I'd really appreciate it. Also, maybe you can point out what's JVM specific and what isn't (for example, I'm running Oracle's v6). Thanks!

Miquel asked Jan 15 '23 21:01



2 Answers

Why do StringBuilders pop up when debugging String concatenation?

Because string concatenation (via the '+' operator) is typically compiled to code that uses a StringBuffer or StringBuilder to do the concatenation. The JLS explicitly permits this behaviour.

"An implementation may choose to perform conversion and concatenation in one step to avoid creating and then discarding an intermediate String object. To increase the performance of repeated string concatenation, a Java compiler may use the StringBuffer class or a similar technique to reduce the number of intermediate String objects that are created by evaluation of an expression." JLS 15.18.1.
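To make that concrete, here is a rough sketch of the rewrite a pre-Java 9 compiler performs. The class and method names (ConcatDesugar, viaPlus, viaBuilder) are made up for illustration, and the exact call sequence is an assumption: it varies between compilers and versions (for example, some pass the first operand to the StringBuilder constructor instead of appending it).

```java
class ConcatDesugar {
    // What you write in the source file.
    static String viaPlus(String variable) {
        return "string" + variable;
    }

    // Approximately what an older compiler generates for viaPlus.
    // The debugger can step into these calls because they are real
    // method invocations in the bytecode, even though they never
    // appear in your source.
    static String viaBuilder(String variable) {
        return new StringBuilder()
                .append("string")
                .append(variable)
                .toString();
    }

    public static void main(String[] args) {
        // Both forms produce an equal String.
        System.out.println(viaPlus("text").equals(viaBuilder("text"))); // true
    }
}
```

When you "step into" the line containing the '+', the debugger follows the generated calls, which is why the StringBuilder constructor and append show up.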

(If your code is using a StringBuffer rather than a StringBuilder, it is probably because it was compiled using a really old Java compiler, or because you have specified a really old target JVM. The StringBuilder class is a relatively recent addition to Java; it was introduced in Java 5. Older versions of the JLS used to mention StringBuffer instead of StringBuilder, IIRC.)


Also, maybe you can point out what's JVM specific and what isn't.

The bytecodes produced for "string" + variable depend on how the Java compiler handles the concatenation. (In fact, all generated bytecodes are Java compiler dependent to some degree. The JLS and JVM specs do not dictate what bytecodes must be generated. The specifications are more about how the program should behave, and what individual bytecodes do.)
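A small sketch of the distinction between "what the spec requires" and "what the compiler happens to emit". The class and method names here are hypothetical. The behaviour shown follows from JLS 15.18.1: the result of a concatenation is a newly created String unless the expression is a constant expression, in which case it is computed at compile time and interned like a literal. How the run-time case is implemented is left to the compiler; as far as I know, javac from JDK 9 onwards emits an invokedynamic call by default instead of StringBuilder calls, so stepping in the debugger looks different there.

```java
class ConcatSemantics {
    // Run-time concatenation: 'variable' is not final, so the
    // expression is not a constant expression.
    static String runtimeConcat() {
        String variable = "text";
        return "string" + variable;
    }

    // Compile-time concatenation: 'constant' is a constant variable,
    // so the whole expression is folded into the literal "stringtext".
    static String constantConcat() {
        final String constant = "text";
        return "string" + constant;
    }

    public static void main(String[] args) {
        // Identity comparisons are used on purpose to expose the difference.
        System.out.println(constantConcat() == "stringtext");          // true
        System.out.println(runtimeConcat() == "stringtext");           // false
        System.out.println(runtimeConcat().equals(constantConcat()));  // true
    }
}
```

The three results are fixed by the language specification; whether the run-time case goes through StringBuilder, StringBuffer or something else is the compiler-dependent part.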


@supercat comments:

I wonder why string concatenation wouldn't use e.g. a String constructor overload which accepts two String objects, allocates a buffer of the proper combined size, and joins them? Or, when joining more strings, an overload which takes a String[]? Creating a String[] containing references to the strings to be joined should be no more expensive than creating a StringBuilder, and being able to create a perfect-sized backing store in one shot should be an easy performance win.

Maybe ... but I'd say probably not. This is a complicated area involving complicated trade-offs. The chosen implementation strategy for string concatenation has to work well across a wide range of different use-cases.

My understanding is that the original strategy was chosen after looking at a number of approaches, and doing some large-scale static code analysis and benchmarking to try to figure out which approach was best. I imagine they considered all of the alternatives that you proposed. (After all, they were / are smart people ...)

Having said that, the complete source code base for Java 6, 7 and 8 is available to you. That means that you could download it and try some experiments of your own to see if your theories are right. If they are ... and you can gather solid evidence that they are ... then submit a patch to the OpenJDK team.
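As a starting point for that kind of experiment, the idea from the comment can be approximated in plain user code. This is only a sketch: joinExact is a hypothetical helper, not a JDK API, and it assumes non-null inputs. It also cannot avoid one extra copy, because the public String(char[]) constructor copies its argument; a real implementation inside the JDK would not have that limitation, so timings from this version prove little on their own.

```java
class ExactSizeConcat {
    // Hypothetical helper: joins the parts using a buffer that is
    // sized exactly once. Assumes no element of 'parts' is null.
    static String joinExact(String... parts) {
        int total = 0;
        for (String part : parts) {
            total += part.length();
        }

        char[] buffer = new char[total];
        int position = 0;
        for (String part : parts) {
            // Copy the characters of 'part' into the buffer at 'position'.
            part.getChars(0, part.length(), buffer, position);
            position += part.length();
        }

        // Note: this constructor copies 'buffer' once more.
        return new String(buffer);
    }

    public static void main(String[] args) {
        String variable = "text";
        System.out.println(joinExact("string", variable)); // stringtext
    }
}
```

Comparing this against the '+' operator with a proper benchmarking harness, across short and long operands, is the sort of evidence the answer is asking for.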

Stephen C answered Mar 14 '23 14:03



@StephenC I am still not convinced by the explanation. The compiler may perform whatever optimization it wants, but when you debug in Eclipse, the compiler-generated code is hidden from the source view, and the debugger should not jump from one section of code to another within the same source file.

The following description in the question suggests that the source code and the bytecode are not in sync, i.e., he is not running the latest code.

When debugging through Snippet 1 in eclipse, I am actually taken to the StringBuilder constructor and to the append method

and how the debugger refers back to the lines in the source code

srikanth yaradla answered Mar 14 '23 15:03
