I was attempting to create a faster version of the String.equals() method and started by simply copying it. The result was quite confusing. When I timed the copy-pasted version against the JVM's own, the JVM version was faster, by anywhere from 6x to 34x! Simply put, the longer the string, the larger the difference.
import java.lang.reflect.Field;

public class EqualsBenchmark {

    // Copy of the comparison loop from String.equals(), applied to the raw char arrays.
    static boolean equals(final char a[], final char b[]) {
        int n = a.length;
        int i = 0;
        while (n-- != 0) {
            if (a[i] != b[i]) return false;
            i++;
        }
        return true;
    }

    public static void main(String[] args) throws Exception {
        String a = "blah balh balh";
        String b = "blah balh balb";
        long me = 0, jvm = 0;

        // Read the backing arrays via reflection. This assumes a JDK where
        // String.value is a char[] (Java 8 or earlier).
        Field value = String.class.getDeclaredField("value");
        value.setAccessible(true);
        final char lhs[] = (char[]) value.get(a);
        final char rhs[] = (char[]) value.get(b);

        // Time the copy-pasted version.
        for (int i = 0; i < 100; i++) {
            long t = System.nanoTime();
            equals(lhs, rhs);
            t = System.nanoTime() - t;
            me += t;
        }

        // Time the JVM's String.equals().
        for (int i = 0; i < 100; i++) {
            long t = System.nanoTime();
            a.equals(b);
            t = System.nanoTime() - t;
            jvm += t;
        }

        System.out.println("me = " + me);
        System.out.println("jvm = " + jvm);
    }
}
Output:
me = 258931
jvm = 14991
The equals() method I wrote is copy-pasted from the String.equals() method. Why is the JVM version faster than its copy-pasted version? Isn't it effectively the same?
Could someone explain why I see such visible differences?
PS: If you wish to see large differences, you could create long (really, really long) strings with just one character differing at the end.
Why is the JVM version faster than its copy-pasted version? Isn't it effectively the same?
Surprisingly, it isn't.
String comparison is such a ubiquitous operation that it is almost certainly the case that your JIT compiler has an intrinsic for String.equals(). This means that the compiler knows how to generate specially crafted machine code for comparing strings. This is done transparently to you, the programmer, when you use String.equals().
This would explain why String.equals() is so much faster than your method, even if superficially they appear identical.
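To see how much of the gap survives once both versions have had a chance to be JIT-compiled, a warmed-up comparison along the following lines may help. This is only a rough sketch, not a rigorous benchmark: the class name EqualsTiming, the helper longString, the string length and the iteration counts are all made up for illustration, and it uses toCharArray() instead of reflection so that it does not depend on how a particular JDK stores the string internally.

```java
public class EqualsTiming {

    // Hand-written element-by-element comparison, similar to the loop in the question.
    static boolean manualEquals(char[] a, char[] b) {
        if (a.length != b.length) return false;
        for (int i = 0; i < a.length; i++) {
            if (a[i] != b[i]) return false;
        }
        return true;
    }

    // Hypothetical helper: n copies of 'x' followed by one final character.
    static String longString(int n, char last) {
        StringBuilder sb = new StringBuilder(n + 1);
        for (int i = 0; i < n; i++) sb.append('x');
        return sb.append(last).toString();
    }

    public static void main(String[] args) {
        // Two long strings that differ only in their last character.
        String a = longString(10_000, 'a');
        String b = longString(10_000, 'b');
        char[] lhs = a.toCharArray();
        char[] rhs = b.toCharArray();
        int sink = 0; // results are consumed so the calls cannot simply be discarded

        // Warm-up: give the JIT compiler a chance to compile both paths before timing.
        for (int i = 0; i < 10_000; i++) {
            if (manualEquals(lhs, rhs)) sink++;
            if (a.equals(b)) sink++;
        }

        final int rounds = 10_000; // arbitrary choice for this sketch

        long t = System.nanoTime();
        for (int i = 0; i < rounds; i++) {
            if (manualEquals(lhs, rhs)) sink++;
        }
        long me = System.nanoTime() - t;

        t = System.nanoTime();
        for (int i = 0; i < rounds; i++) {
            if (a.equals(b)) sink++;
        }
        long jvm = System.nanoTime() - t;

        System.out.println("me   = " + me + " ns");
        System.out.println("jvm  = " + jvm + " ns");
        System.out.println("sink = " + sink);
    }
}
```

The numbers will vary by JVM, hardware and run, so treat them as indicative only. On HotSpot, running with -XX:+UnlockDiagnosticVMOptions -XX:+PrintIntrinsics should, as far as I know, log which intrinsics the compiler actually applied, which is one way to check whether String.equals() was replaced on your setup.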
A quick search finds several bug reports that mention such an intrinsic in HotSpot. For example, 7041100 : The load in String.equals intrinsic executed before null check.
The relevant HotSpot source can be found here. The functions in question are:
Node* LibraryCallKit::make_string_method_node(int opcode, Node* str1, Node* cnt1, Node* str2, Node* cnt2) {
and
bool LibraryCallKit::inline_string_equals() {
HotSpot allows its developers to provide a native implementation of a method in addition to the Java implementation. The Java code is swapped out at runtime and replaced by the optimized version. Such a replacement is called an intrinsic. A few hundred methods from the base classes are optimized with intrinsics.
By looking at the OpenJDK source code you can see the x86_64 implementation of String.equals. You can also look into vmSymbols to get the list of all intrinsics (search for do_intrinsic).