I thought String.indexOf(char) would be slightly faster than String.indexOf(String) when searching for a single character versus a single-character String (e.g. 'x' vs. "x").
To check my guess, I wrote some simple test code like the following.
public static void main(String[] args) {
    IndexOfTest test = new IndexOfTest(Integer.parseInt(args[0]));
    test.run();
}

public IndexOfTest(int loop) {
    this.loop = loop;
}

public void run() {
    long start, end;

    start = System.currentTimeMillis();
    for (int i = 0; i < loop; i++) {
        alphabet.indexOf("x");
    }
    end = System.currentTimeMillis();
    System.out.println("indexOf(String) : " + (end - start) + "ms");

    start = System.currentTimeMillis();
    for (int i = 0; i < loop; i++) {
        alphabet.indexOf('x');
    }
    end = System.currentTimeMillis();
    System.out.println("indexOf(char) : " + (end - start) + "ms");
}
alphabet is a String variable that contains "abcd...xyzABCD...XYZ".
From this code, I got the following results (times in ms):
loop      10^3   10^4   10^5   10^6   10^7
String       1      7      8      9      9
char         1      2      5     10     64
String.indexOf(String) appears to converge to about 9 ms, whereas the time for String.indexOf(char) keeps growing as the loop count increases.
I'm very confused. Is there some optimization for the String case? If not, how should I interpret this result?
I also ran JMH with the two benchmark methods below. Each method calls one of the indexOf methods.
import org.openjdk.jmh.annotations.*;

@State(Scope.Thread)
public class MyBenchmark {
    private String alphabet = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

    @Benchmark
    public void indexOfString() {
        alphabet.indexOf("x");
    }

    @Benchmark
    public void indexOfChar() {
        alphabet.indexOf('x');
    }
}
Result:
Benchmark                   Mode  Cnt           Score        Error  Units
MyBenchmark.indexOfChar    thrpt   30   142106399.525 ±  51360.808  ops/s
MyBenchmark.indexOfString  thrpt   30  2178872840.575 ± 864573.421  ops/s
This result also shows that indexOf(String) is faster.
I suspect there is some hidden optimization at work.
Any ideas?
Your JMH test is incorrect because you don't consume the result, so the indexOf call may (or may not) be removed entirely by the JIT compiler. In your case it seems that the JIT compiler determined that indexOf(String) has no side effects and removed the call entirely, but did not do the same for indexOf(char). Always consume the result (the simplest way is to return it from the benchmark). Here's my version:
import java.util.*;
import java.util.concurrent.TimeUnit;
import org.openjdk.jmh.annotations.*;

@State(Scope.Benchmark)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@Warmup(iterations = 5, time = 500, timeUnit = TimeUnit.MILLISECONDS)
@Measurement(iterations = 10, time = 500, timeUnit = TimeUnit.MILLISECONDS)
@Fork(3)
public class IndexOfTest {
    private String str;
    private char c;
    private String s;

    @Setup
    public void setup() {
        str = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
        c = 'z';
        s = "z";
    }

    @Benchmark
    public int indexOfChar() {
        return str.indexOf('z');
    }

    @Benchmark
    public int indexOfString() {
        return str.indexOf("z");
    }

    @Benchmark
    public int indexOfCharIndirect() {
        return str.indexOf(c);
    }

    @Benchmark
    public int indexOfStringIndirect() {
        return str.indexOf(s);
    }
}
I tested the same thing, but added two indirect tests in which the searched char or String is loaded from a field, so its exact value is unknown during JIT compilation. The results are as follows (Intel x64):
# JMH 1.11.2 (released 27 days ago)
# VM version: JDK 1.8.0_45, VM 25.45-b02
Benchmark                          Mode  Cnt   Score   Error  Units
IndexOfTest.indexOfChar            avgt   30  25,364 ± 0,424  ns/op
IndexOfTest.indexOfCharIndirect    avgt   30  25,287 ± 0,210  ns/op
IndexOfTest.indexOfString          avgt   30  24,370 ± 0,100  ns/op
IndexOfTest.indexOfStringIndirect  avgt   30  27,198 ± 0,048  ns/op
As you can see, indexOfChar performs the same regardless of direct or indirect access. indexOfString is slightly faster for direct access, but somewhat slower for indirect access. That's because indexOf(String) is a JVM intrinsic: its Java code is replaced by the JIT compiler with an efficient inline implementation. For a constant string known at JIT compilation time, it's possible to generate more efficient code.
In general there's no big difference, at least for such short strings. Thus you may use either of these methods to match a single character.
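If you want to see the "consume the result" idea outside of JMH, here is a minimal plain-Java sketch. The class name IndexOfSinkDemo and its helper methods are made up for illustration; they are not part of JMH or the JDK. Each loop adds the returned index to a running sum, so the return value is actually used, and the two sums can be compared to confirm that both overloads find the same position. Treat the printed timings as a rough hint only: a hand-rolled loop like this still lacks warm-up, forking, and the other protections JMH gives you, and the JIT compiler may still optimize it in ways that skew the numbers.

```java
// Illustrative sketch only; not a replacement for a JMH benchmark.
final class IndexOfSinkDemo {
    // Same test string as in the JMH benchmark above.
    static final String STR = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

    // Adds every result to a sum so the return value of indexOf(char) is used.
    static long sumIndexOfChar(String str, char c, int loops) {
        long sink = 0;
        for (int i = 0; i < loops; i++) {
            sink += str.indexOf(c);
        }
        return sink;
    }

    // Same loop for indexOf(String).
    static long sumIndexOfString(String str, String s, int loops) {
        long sink = 0;
        for (int i = 0; i < loops; i++) {
            sink += str.indexOf(s);
        }
        return sink;
    }

    public static void main(String[] args) {
        int loops = 1_000_000;

        long t0 = System.nanoTime();
        long charSum = sumIndexOfChar(STR, 'z', loops);
        long t1 = System.nanoTime();
        long stringSum = sumIndexOfString(STR, "z", loops);
        long t2 = System.nanoTime();

        // Printing the sums keeps them observable after the loops finish.
        System.out.println("indexOf(char)   sum=" + charSum + " time=" + (t1 - t0) / 1_000_000 + "ms");
        System.out.println("indexOf(String) sum=" + stringSum + " time=" + (t2 - t1) / 1_000_000 + "ms");
        System.out.println("same result: " + (charSum == stringSum));
    }
}
```

Both sums should be equal, since 'z' and "z" match at the same index in this string. For real measurements, stick with JMH and return the value from the benchmark method as shown above.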