UPDATE: It looks like my 13.0.1 is calling code from this diffed CharMatcher.
http://code.google.com/p/guava-libraries/source/diff?spec=svn69ad96b719d7cd3d872a948d7454f17b816a21c2&r=464b0cfab7c3b6713c35e6f3ae7426542668c77b&format=side&path=/guava/src/com/google/common/base/CharMatcher.java
I seem to be having an issue with Guava's Charmatcher. I am currently writing some JUnit tests and I am noticing the first test takes a bit of time (15+ seconds), but the future tests are all approximately the same (.3 seconds), with a majority of the tests running through the same code.
My analysis of the code has pinpointed that the Charmatcher in Guava's library seems to be the culprit. It looks like there is a bit of static initialization code in CharMatcher that might be the true reason it is taking so long. Is there any way I can disable or optimize this behavior? Is there something I am missing from the wiki pages or help files?
Guava version 13.0.1, System: Linux 64, Eclipse 3.6.1
EDIT: I built a test app with the following code:
import com.google.common.base.CharMatcher;
public class Main {
public static void main(String[] args) {
// Using system instead of stopwatch to isolate library.
long startTime = System.currentTimeMillis();
CharMatcher.is(' ');
long endTime = System.currentTimeMillis();
System.out.println("took " + String.valueOf(endTime-startTime) + " ms");
startTime = System.currentTimeMillis();
CharMatcher.is('d');
endTime = System.currentTimeMillis();
System.out.println("2nd took " + String.valueOf(endTime-startTime) + " ms");
}
}
This resulted in the following output:
took 15945 ms
2nd took 0 ms
I ran this in eclipse minus the JUnit framework and just google's guava library. Also I packaged an executable char and received similar results. I'll come back with a second edit after I run a profiler through it.
Thanks for any help.
Edit 2: Results from profiling:
Main.main(String[]) 22,556 ms
com.google.common.base.CharMatcher.<clinit>() 22.556 ms
com.google.common.base.CharMatcher.precomputed() 22,550 ms
com.google.common.base.Platform.precomputeCharMatcher(CharMatcher) 22,550 ms
com.google.common.base.CharMatcher.precomputedInternal() 22,550 ms
com.google.common.base.CharMatcher.slowGetChars() 13,638 ms
com.google.common.base.CharMatcher.setBits(CharMatcher$LookupTable) 8,911 ms
Update: This is fixed in Guava 14.
You are running into some kind of JIT bug present in u21, and likely earlier versions, but not in at least some later JDK versions. Here's the -XX:+PrintCompilation output for your test program in u21:
1 java.util.Arrays::binarySearch0 (72 bytes)
2 com.google.common.base.CharMatcher$Or::matches (28 bytes)
3 com.google.common.base.CharMatcher$12::matches (22 bytes)
2 made not entrant com.google.common.base.CharMatcher$Or::matches (28 bytes)
4 com.google.common.base.CharMatcher$11::matches (17 bytes)
5 java.util.Arrays::binarySearch (9 bytes)
6 com.google.common.base.CharMatcher$Or::matches (28 bytes)
1% com.google.common.base.CharMatcher::slowGetChars @ 9 (52 bytes)
6 made not entrant com.google.common.base.CharMatcher$Or::matches (28 bytes)
7 com.google.common.base.CharMatcher$Or::matches (28 bytes)
7 made not entrant com.google.common.base.CharMatcher$Or::matches (28 bytes)
1% made not entrant com.google.common.base.CharMatcher::slowGetChars @ -2 (52 bytes)
2% com.google.common.base.CharMatcher::slowGetChars @ 9 (52 bytes)
8 com.google.common.base.CharMatcher$Or::matches (28 bytes)
8 made not entrant com.google.common.base.CharMatcher$Or::matches (28 bytes)
2% made not entrant com.google.common.base.CharMatcher::slowGetChars @ -2 (52 bytes)
3% com.google.common.base.CharMatcher::slowGetChars @ 9 (52 bytes)
9 com.google.common.base.CharMatcher$Or::matches (28 bytes)
9 made not entrant com.google.common.base.CharMatcher$Or::matches (28 bytes)
3% made not entrant com.google.common.base.CharMatcher::slowGetChars @ -2 (52 bytes)
4% com.google.common.base.CharMatcher::slowGetChars @ 9 (52 bytes)
10 com.google.common.base.CharMatcher$Or::matches (28 bytes)
10 made not entrant com.google.common.base.CharMatcher$Or::matches (28 bytes)
4% made not entrant com.google.common.base.CharMatcher::slowGetChars @ -2 (52 bytes)
11 com.google.common.base.CharMatcher$Or::matches (28 bytes)
5% com.google.common.base.CharMatcher::slowGetChars @ 9 (52 bytes)
11 made not entrant com.google.common.base.CharMatcher$Or::matches (28 bytes)
12 com.google.common.base.CharMatcher$Or::matches (28 bytes)
5% made not entrant com.google.common.base.CharMatcher::slowGetChars @ -2 (52 bytes)
12 made not entrant com.google.common.base.CharMatcher$Or::matches (28 bytes)
6% com.google.common.base.CharMatcher::slowGetChars @ 9 (52 bytes)
13 com.google.common.base.CharMatcher$Or::matches (28 bytes)
13 made not entrant com.google.common.base.CharMatcher$Or::matches (28 bytes)
6% made not entrant com.google.common.base.CharMatcher::slowGetChars @ -2 (52 bytes)
7% com.google.common.base.CharMatcher::slowGetChars @ 9 (52 bytes)
14 com.google.common.base.CharMatcher$Or::matches (28 bytes)
15 com.google.common.base.CharMatcher::slowGetChars (52 bytes)
14 made not entrant com.google.common.base.CharMatcher$Or::matches (28 bytes)
7% made not entrant com.google.common.base.CharMatcher::slowGetChars @ -2 (52 bytes)
16 com.google.common.base.CharMatcher$Or::matches (28 bytes)
16 made not entrant com.google.common.base.CharMatcher$Or::matches (28 bytes)
17 com.google.common.base.CharMatcher$Or::matches (28 bytes)
17 made not entrant com.google.common.base.CharMatcher$Or::matches (28 bytes)
8% com.google.common.base.CharMatcher::slowGetChars @ 9 (52 bytes)
18 com.google.common.base.CharMatcher$Or::matches (28 bytes)
18 made not entrant com.google.common.base.CharMatcher$Or::matches (28 bytes)
19 com.google.common.base.CharMatcher$Or::matches (28 bytes)
19 made not entrant com.google.common.base.CharMatcher$Or::matches (28 bytes)
20 com.google.common.base.CharMatcher$Or::matches (28 bytes)
8% made not entrant com.google.common.base.CharMatcher::slowGetChars @ -2 (52 bytes)
20 made not entrant com.google.common.base.CharMatcher$Or::matches (28 bytes)
21 com.google.common.base.CharMatcher$Or::matches (28 bytes)
21 made not entrant com.google.common.base.CharMatcher$Or::matches (28 bytes)
22 com.google.common.base.CharMatcher$Or::matches (28 bytes)
9% com.google.common.base.CharMatcher::slowGetChars @ 9 (52 bytes)
22 made not entrant com.google.common.base.CharMatcher$Or::matches (28 bytes)
23 com.google.common.base.CharMatcher$Or::matches (28 bytes)
23 made not entrant com.google.common.base.CharMatcher$Or::matches (28 bytes)
24 com.google.common.base.CharMatcher$Or::matches (28 bytes)
24 made not entrant com.google.common.base.CharMatcher$Or::matches (28 bytes)
9% made not entrant com.google.common.base.CharMatcher::slowGetChars @ -2 (52 bytes)
25 com.google.common.base.CharMatcher$Or::matches (28 bytes)
25 made not entrant com.google.common.base.CharMatcher$Or::matches (28 bytes)
26 com.google.common.base.CharMatcher$Or::matches (28 bytes)
26 made not entrant com.google.common.base.CharMatcher$Or::matches (28 bytes)
10% com.google.common.base.CharMatcher::slowGetChars @ 9 (52 bytes)
27 com.google.common.base.CharMatcher$Or::matches (28 bytes)
27 made not entrant com.google.common.base.CharMatcher$Or::matches (28 bytes)
28 com.google.common.base.CharMatcher$Or::matches (28 bytes)
28 made not entrant com.google.common.base.CharMatcher$Or::matches (28 bytes)
10% made not entrant com.google.common.base.CharMatcher::slowGetChars @ -2 (52 bytes)
29 com.google.common.base.CharMatcher$Or::matches (28 bytes)
29 made not entrant com.google.common.base.CharMatcher$Or::matches (28 bytes)
30 com.google.common.base.CharMatcher$Or::matches (28 bytes)
30 made not entrant com.google.common.base.CharMatcher$Or::matches (28 bytes)
11% com.google.common.base.CharMatcher::slowGetChars @ 9 (52 bytes)
31 com.google.common.base.CharMatcher$Or::matches (28 bytes)
31 made not entrant com.google.common.base.CharMatcher$Or::matches (28 bytes)
32 com.google.common.base.CharMatcher$Or::matches (28 bytes)
32 made not entrant com.google.common.base.CharMatcher$Or::matches (28 bytes)
11% made not entrant com.google.common.base.CharMatcher::slowGetChars @ -2 (52 bytes)
33 com.google.common.base.CharMatcher$Or::matches (28 bytes)
33 made not entrant com.google.common.base.CharMatcher$Or::matches (28 bytes)
34 com.google.common.base.CharMatcher$Or::matches (28 bytes)
34 made not entrant com.google.common.base.CharMatcher$Or::matches (28 bytes)
35 com.google.common.base.CharMatcher$Or::matches (28 bytes)
12% com.google.common.base.CharMatcher::slowGetChars @ 9 (52 bytes)
35 made not entrant com.google.common.base.CharMatcher$Or::matches (28 bytes)
36 com.google.common.base.CharMatcher$Or::matches (28 bytes)
36 made not entrant com.google.common.base.CharMatcher$Or::matches (28 bytes)
37 com.google.common.base.CharMatcher$Or::matches (28 bytes)
37 made not entrant com.google.common.base.CharMatcher$Or::matches (28 bytes)
12% made not entrant com.google.common.base.CharMatcher::slowGetChars @ -2 (52 bytes)
38 com.google.common.base.CharMatcher$Or::matches (28 bytes)
38 made not entrant com.google.common.base.CharMatcher$Or::matches (28 bytes)
39 com.google.common.base.CharMatcher$Or::matches (28 bytes)
39 made not entrant com.google.common.base.CharMatcher$Or::matches (28 bytes)
13% com.google.common.base.CharMatcher::slowGetChars @ 9 (52 bytes)
40 com.google.common.base.CharMatcher$Or::matches (28 bytes)
40 made not entrant com.google.common.base.CharMatcher$Or::matches (28 bytes)
41 com.google.common.base.CharMatcher$Or::matches (28 bytes)
41 made not entrant com.google.common.base.CharMatcher$Or::matches (28 bytes)
13% made not entrant com.google.common.base.CharMatcher::slowGetChars @ -2 (52 bytes)
42 com.google.common.base.CharMatcher$Or::matches (28 bytes)
42 made not entrant com.google.common.base.CharMatcher$Or::matches (28 bytes)
43 com.google.common.base.CharMatcher$Or::matches (28 bytes)
43 made not entrant com.google.common.base.CharMatcher$Or::matches (28 bytes)
44 com.google.common.base.CharMatcher$Or::matches (28 bytes)
14% com.google.common.base.CharMatcher::slowGetChars @ 9 (52 bytes)
44 made not entrant com.google.common.base.CharMatcher$Or::matches (28 bytes)
45 com.google.common.base.CharMatcher$Or::matches (28 bytes)
45 made not entrant com.google.common.base.CharMatcher$Or::matches (28 bytes)
46 com.google.common.base.CharMatcher$Or::matches (28 bytes)
46 made not entrant com.google.common.base.CharMatcher$Or::matches (28 bytes)
14% made not entrant com.google.common.base.CharMatcher::slowGetChars @ -2 (52 bytes)
47 com.google.common.base.CharMatcher$Or::matches (28 bytes)
47 made not entrant com.google.common.base.CharMatcher$Or::matches (28 bytes)
48 com.google.common.base.CharMatcher$Or::matches (28 bytes)
48 made not entrant com.google.common.base.CharMatcher$Or::matches (28 bytes)
15% com.google.common.base.CharMatcher::slowGetChars @ 9 (52 bytes)
49 com.google.common.base.CharMatcher$Or::matches (28 bytes)
49 made not entrant com.google.common.base.CharMatcher$Or::matches (28 bytes)
50 com.google.common.base.CharMatcher$Or::matches (28 bytes)
50 made not entrant com.google.common.base.CharMatcher$Or::matches (28 bytes)
15% made not entrant com.google.common.base.CharMatcher::slowGetChars @ -2 (52 bytes)
51 com.google.common.base.CharMatcher$Or::matches (28 bytes)
51 made not entrant com.google.common.base.CharMatcher$Or::matches (28 bytes)
52 com.google.common.base.CharMatcher$Or::matches (28 bytes)
52 made not entrant com.google.common.base.CharMatcher$Or::matches (28 bytes)
16% com.google.common.base.CharMatcher::slowGetChars @ 9 (52 bytes)
53 com.google.common.base.CharMatcher$Or::matches (28 bytes)
53 made not entrant com.google.common.base.CharMatcher$Or::matches (28 bytes)
54 com.google.common.base.CharMatcher$Or::matches (28 bytes)
54 made not entrant com.google.common.base.CharMatcher$Or::matches (28 bytes)
16% made not entrant com.google.common.base.CharMatcher::slowGetChars @ -2 (52 bytes)
55 com.google.common.base.CharMatcher$Or::matches (28 bytes)
55 made not entrant com.google.common.base.CharMatcher$Or::matches (28 bytes)
56 com.google.common.base.CharMatcher$Or::matches (28 bytes)
56 made not entrant com.google.common.base.CharMatcher$Or::matches (28 bytes)
57 com.google.common.base.CharMatcher$Or::matches (28 bytes)
17% com.google.common.base.CharMatcher::slowGetChars @ 9 (52 bytes)
57 made not entrant com.google.common.base.CharMatcher$Or::matches (28 bytes)
58 com.google.common.base.CharMatcher$Or::matches (28 bytes)
58 made not entrant com.google.common.base.CharMatcher$Or::matches (28 bytes)
59 com.google.common.base.CharMatcher$Or::matches (28 bytes)
59 made not entrant com.google.common.base.CharMatcher$Or::matches (28 bytes)
17% made not entrant com.google.common.base.CharMatcher::slowGetChars @ -2 (52 bytes)
60 com.google.common.base.CharMatcher$Or::matches (28 bytes)
60 made not entrant com.google.common.base.CharMatcher$Or::matches (28 bytes)
61 com.google.common.base.CharMatcher$Or::matches (28 bytes)
61 made not entrant com.google.common.base.CharMatcher$Or::matches (28 bytes)
18% com.google.common.base.CharMatcher::slowGetChars @ 9 (52 bytes)
62 com.google.common.base.CharMatcher$Or::matches (28 bytes)
62 made not entrant com.google.common.base.CharMatcher$Or::matches (28 bytes)
63 com.google.common.base.CharMatcher$Or::matches (28 bytes)
63 made not entrant com.google.common.base.CharMatcher$Or::matches (28 bytes)
18% made not entrant com.google.common.base.CharMatcher::slowGetChars @ -2 (52 bytes)
64 com.google.common.base.CharMatcher$Or::matches (28 bytes)
64 made not entrant com.google.common.base.CharMatcher$Or::matches (28 bytes)
65 com.google.common.base.CharMatcher$Or::matches (28 bytes)
65 made not entrant com.google.common.base.CharMatcher$Or::matches (28 bytes)
19% com.google.common.base.CharMatcher::slowGetChars @ 9 (52 bytes)
66 com.google.common.base.CharMatcher$Or::matches (28 bytes)
<I omitted ~500 very similar lines here>
143% made not entrant com.google.common.base.CharMatcher::setBits @ -2 (30 bytes)
144% com.google.common.base.CharMatcher::setBits @ 2 (30 bytes)
144% made not entrant com.google.common.base.CharMatcher::setBits @ -2 (30 bytes)
145% com.google.common.base.CharMatcher::setBits @ 2 (30 bytes)
145% made not entrant com.google.common.base.CharMatcher::setBits @ -2 (30 bytes)
took 7599 ms
2nd took 0 ms
Here's the same output for u34:
64 1 java.util.Arrays::binarySearch0 (72 bytes)
68 2 com.google.common.base.CharMatcher$Or::matches (28 bytes)
68 3 com.google.common.base.CharMatcher$12::matches (22 bytes)
70 4 com.google.common.base.CharMatcher$11::matches (17 bytes)
71 5 java.util.Arrays::binarySearch (9 bytes)
71 1% com.google.common.base.CharMatcher::slowGetChars @ 9 (52 bytes)
76 2 made not entrant com.google.common.base.CharMatcher$Or::matches (28 bytes)
76 6 com.google.common.base.CharMatcher$Or::matches (28 bytes)
88 1% made not entrant com.google.common.base.CharMatcher::slowGetChars @ -2 (52 bytes)
88 6 made not entrant com.google.common.base.CharMatcher$Or::matches (28 bytes)
88 7 com.google.common.base.CharMatcher::slowGetChars (52 bytes)
89 8 com.google.common.base.CharMatcher$Or::matches (28 bytes)
91 9 com.google.common.base.CharMatcher$8::matches (14 bytes)
91 2% com.google.common.base.CharMatcher::slowGetChars @ 9 (52 bytes)
98 10 java.lang.String::indexOf (166 bytes)
98 11 java.lang.String::indexOf (151 bytes)
102 3% com.google.common.base.CharMatcher::setBits @ 2 (30 bytes)
113 12 com.google.common.base.CharMatcher::setBits (30 bytes)
113 13 com.google.common.base.CharMatcher$LookupTable::set (15 bytes)
As you can see, it is a hell of a lot more sane in u34. It looks like what is happening in u21 is that some combination of OSR (affecting slowGetChars
method as it has a large loop), and recursive megamorphic calls (affecting CharMatcher$Or
and other subclasses of CharMatcher
- which are called at the end of the recursive CharMatcher$Or
chain) produce a perfect storm of JIT de-optimization and subsequent recompilation, which is resolved in u34.
Still, I think the Guava guys are being a bit abusive by creating a 65k element LUT during static init, even if you never use the matchers in question, which takes still about ~50ms when JIT is working properly. I filed a bug against guava, let's see if the maintainers agree :). Update: the bug was fixed within a couple of days of me filing this, the fix will be in Guava 14. That's what I call quick turnaround!
Original post follows:
I have the exact same behavior and ran into this post looking for a solution.
Looking at the code, it seems that they are making a 65k element (each element takes one bit) lookup table and checking if every character matches, which in this requires many nested function calls due to the fluent style being used to define the matchers:
public static final CharMatcher INVISIBLE = inRange('\u0000', '\u0020')
.or(inRange('\u007f', '\u00a0'))
.or(is('\u00ad'))
.or(inRange('\u0600', '\u0604'))
.or(anyOf("\u06dd\u070f\u1680\u180e"))
.or(inRange('\u2000', '\u200f'))
.or(inRange('\u2028', '\u202f'))
.or(inRange('\u205f', '\u2064'))
.or(inRange('\u206a', '\u206f'))
.or(is('\u3000'))
.or(inRange('\ud800', '\uf8ff'))
.or(anyOf("\ufeff\ufff9\ufffa\ufffb"))
.withToString("CharMatcher.INVISIBLE")
.precomputed();
That's the code that is triggering the long static init, which on my fast box takes ~7 seconds. Here's the output of your app on my box:
took 6814 ms
2nd took 0 ms
I'm filing a guava bug.
As a quick help, try to switch to the newest version from the source repo. There are changes there, e.g. the method slowGetChars
is gone. It used to be a bit slow, but nowhere near what's happening to you.
For everybody else it runs some hundred times faster and without being able to reproduce it it's hard to help.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With