The JVM optimzes String concatenation with +
and replaces it with a StringBuilder
. This should be the same in Scala. But what happens if strings are concatenated with ++=
?
var x = "x"
x ++= "y"
x ++= "z"
As far as I know this methods treats strings like char seqences, so even if the JVM would create a StringBuilder
it would lead to many method calls, right? Would it be better to use a StringBuilder instead?
To what type is the String converted implicitly?
Concatenating strings in Scala Scala provides concat() method to concatenate two strings, this method returns a new string which is created using two strings. You can also use '+' operator to concatenate two strings.
Doing N concatenations requires creating N new strings in the process. join() , on the other hand, only has to create a single string (the final result) and thus works much faster.
Note that regular string concatenations are faster than using the StringBuilder but only when you're using a few of them at a time. If you are using two or three string concatenations, use a string.
It is always better to use StringBuilder. append to concatenate two string values.
There is a huge HUGE difference in time taken.
If you add strings repeatedly using +=
you do not optimize away the O(n^2)
cost of creating incrementally longer strings. So for adding one or two you won't see a difference, but it doesn't scale; by the time you get to adding 100 (short) strings, using a StringBuilder is over 20x faster. (Precise data: 1.3 us vs. 27.1 us to add the string representations of the numbers 0 to 100; timings should be reproducible to about += 5% and of course are for my machine.)
Using ++=
on a var
String
is far far worse yet, because you are then instructing Scala to treat a string as a character-by-character collection which then requires all sorts of wrappers to make the String look like a collection (including boxed character-by-character addition using the generic version of ++
!). Now you're 16x slower again on 100 additions! (Precise data: 428.8 us for ++=
on a var string instead of +=
's 26.7 us.)
If you write a single statement with a bunch of +
es then the Scala compiler will use a StringBuilder and end up with an efficient result (Data: 1.8 us on non-constant strings pulled out of an array).
So, if you add strings with anything other than +
in line, and you care about efficiency, use a StringBuilder
. Definitely don't use ++=
to add another String
to a var
String
; there just isn't any reason to do it, and there's a big runtime penalty.
(Note--very often you don't care at all how efficient your string additions are! Don't clutter your code with extra StringBuilder
s unless you have reason to suspect that this particular code path is getting called a lot.)
Actually, the inconvenient truth is StringOps
usually remains an allocation:
scala> :pa
// Entering paste mode (ctrl-D to finish)
class Concat {
var x = "x"
x ++= "y"
x ++= "z"
}
// Exiting paste mode, now interpreting.
defined class Concat
scala> :javap -prv Concat
Binary file Concat contains $line3.$read$$iw$$iw$Concat
Size 1211 bytes
MD5 checksum 1900522728cbb0ed0b1d3f8b962667ad
Compiled from "<console>"
public class $line3.$read$$iw$$iw$Concat
SourceFile: "<console>"
[snip]
public $line3.$read$$iw$$iw$Concat();
descriptor: ()V
flags: ACC_PUBLIC
Code:
stack=6, locals=1, args_size=1
0: aload_0
1: invokespecial #19 // Method java/lang/Object."<init>":()V
4: aload_0
5: ldc #20 // String x
7: putfield #10 // Field x:Ljava/lang/String;
10: aload_0
11: new #22 // class scala/collection/immutable/StringOps
14: dup
15: getstatic #28 // Field scala/Predef$.MODULE$:Lscala/Predef$;
18: aload_0
19: invokevirtual #30 // Method x:()Ljava/lang/String;
22: invokevirtual #34 // Method scala/Predef$.augmentString:(Ljava/lang/String;)Ljava/lang/String;
25: invokespecial #36 // Method scala/collection/immutable/StringOps."<init>":(Ljava/lang/String;)V
28: new #22 // class scala/collection/immutable/StringOps
31: dup
32: getstatic #28 // Field scala/Predef$.MODULE$:Lscala/Predef$;
35: ldc #38 // String y
37: invokevirtual #34 // Method scala/Predef$.augmentString:(Ljava/lang/String;)Ljava/lang/String;
40: invokespecial #36 // Method scala/collection/immutable/StringOps."<init>":(Ljava/lang/String;)V
43: getstatic #28 // Field scala/Predef$.MODULE$:Lscala/Predef$;
46: invokevirtual #42 // Method scala/Predef$.StringCanBuildFrom:()Lscala/collection/generic/CanBuildFrom;
49: invokevirtual #46 // Method scala/collection/immutable/StringOps.$plus$plus:(Lscala/collection/GenTraversableOnce;Lscala/collection/generic/CanBuildFrom;)Ljava/lang/Object;
52: checkcast #48 // class java/lang/String
55: invokevirtual #50 // Method x_$eq:(Ljava/lang/String;)V
See more demonstration at this answer.
Edit: To say more, you're building the String on each reassignment, so, no you're not using a single StringBuilder
.
However, the optimization is done by javac
and not the JIT compiler, so to compare fruits of the same kind:
public class Strcat {
public String strcat(String s) {
String t = " hi ";
String u = " by ";
return s + t + u; // OK
}
public String strcat2(String s) {
String t = s + " hi ";
String u = t + " by ";
return u; // bad
}
}
whereas
$ scala
Welcome to Scala version 2.11.2 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_11).
Type in expressions to have them evaluated.
Type :help for more information.
scala> :se -Xprint:typer
scala> class K { def f(s: String, t: String, u: String) = s ++ t ++ u }
[[syntax trees at end of typer]] // <console>
def f(s: String, t: String, u: String): String = scala.this.Predef.augmentString(scala.this.Predef.augmentString(s).++[Char, String](scala.this.Predef.augmentString(t))(scala.this.Predef.StringCanBuildFrom)).++[Char, String](scala.this.Predef.augmentString(u))(scala.this.Predef.StringCanBuildFrom)
is bad. Or, worse, to unroll Rex's explanation:
"abc" ++ "def"
augmentString("abc").++[Char, String](augmentString("def"))(StringCanBuildFrom)
collection.mutable.StringBuilder.newBuilder ++= new WrappedString(augmentString("def"))
val b = collection.mutable.StringBuilder.newBuilder
new WrappedString(augmentString("def")) foreach b.+=
As Rex explained, StringBuilder
overrides ++=(String)
but not Growable.++=(Traversable[Char])
.
In case you've ever wondered what unaugmentString
is for:
28: invokevirtual #40 // Method scala/Predef$.augmentString:(Ljava/lang/String;)Ljava/lang/String;
31: invokevirtual #43 // Method scala/Predef$.unaugmentString:(Ljava/lang/String;)Ljava/lang/String;
34: invokespecial #46 // Method scala/collection/immutable/WrappedString."<init>":(Ljava/lang/String;)V
And just to show that you do finally call unadorned +=(Char)
but after boxing and unboxing:
public final scala.collection.mutable.StringBuilder apply(char);
flags: ACC_PUBLIC, ACC_FINAL
Code:
stack=2, locals=2, args_size=2
0: aload_0
1: getfield #19 // Field b$1:Lscala/collection/mutable/StringBuilder;
4: iload_1
5: invokevirtual #24 // Method scala/collection/mutable/StringBuilder.$plus$eq:(C)Lscala/collection/mutable/StringBuilder;
8: areturn
LocalVariableTable:
Start Length Slot Name Signature
0 9 0 this L$line10/$read$$iw$$iw$$anonfun$1;
0 9 1 x C
LineNumberTable:
line 9: 0
public final java.lang.Object apply(java.lang.Object);
flags: ACC_PUBLIC, ACC_FINAL, ACC_BRIDGE, ACC_SYNTHETIC
Code:
stack=2, locals=2, args_size=2
0: aload_0
1: aload_1
2: invokestatic #35 // Method scala/runtime/BoxesRunTime.unboxToChar:(Ljava/lang/Object;)C
5: invokevirtual #37 // Method apply:(C)Lscala/collection/mutable/StringBuilder;
8: areturn
LocalVariableTable:
Start Length Slot Name Signature
0 9 0 this L$line10/$read$$iw$$iw$$anonfun$1;
0 9 1 v1 Ljava/lang/Object;
LineNumberTable:
line 9: 0
A good laugh does get some oxygen into the bloodstream.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With