I've done some benchmarks and have results that I don't know how to explain.
I have 2 classes doing the same (computation heavy) thing with generic arrays, both of them use specialization (@specialized
, later @spec
). One class is defined like:
class A [@spec T] {
def a(d: Array[T], c: Whatever[T], ...) = ...
...
}
Second: (singleton)
object B {
def a[@spec T](d: Array[T], c: Whatever[T], ...) = ...
...
}
And in the second case, I get huge performance hit. Why this can happen? (Note: at the moment I don't understand Java bytecode very well, and Scala compiler internals too.)
Full code is here: https://github.com/magicgoose/trashbox/tree/master/sorting_tests/src/magicgoose/sorting
This is sorting algorithm ripped from Java, (almost)automagically converted to Scala and comparison operations changed to generic ones to allow using custom comparisons with primitive types without boxing. Plus simple benchmark (tests on different lengths, with JVM warmup and averaging)
The results looks like: (left column is original Java Arrays.sort(int[])
)
JavaSort | JavaSortGen$mcI$sp | JavaSortGenSingleton$mcI$sp
length 2 | time 0.00003ms | length 2 | time 0.00004ms | length 2 | time 0.00006ms
length 3 | time 0.00003ms | length 3 | time 0.00005ms | length 3 | time 0.00011ms
length 4 | time 0.00005ms | length 4 | time 0.00006ms | length 4 | time 0.00017ms
length 6 | time 0.00008ms | length 6 | time 0.00010ms | length 6 | time 0.00036ms
length 9 | time 0.00013ms | length 9 | time 0.00015ms | length 9 | time 0.00069ms
length 13 | time 0.00022ms | length 13 | time 0.00028ms | length 13 | time 0.00135ms
length 19 | time 0.00037ms | length 19 | time 0.00040ms | length 19 | time 0.00245ms
length 28 | time 0.00072ms | length 28 | time 0.00060ms | length 28 | time 0.00490ms
length 42 | time 0.00127ms | length 42 | time 0.00096ms | length 42 | time 0.01018ms
length 63 | time 0.00173ms | length 63 | time 0.00179ms | length 63 | time 0.01052ms
length 94 | time 0.00280ms | length 94 | time 0.00280ms | length 94 | time 0.01522ms
length 141 | time 0.00458ms | length 141 | time 0.00479ms | length 141 | time 0.02376ms
length 211 | time 0.00731ms | length 211 | time 0.00763ms | length 211 | time 0.03648ms
length 316 | time 0.01310ms | length 316 | time 0.01436ms | length 316 | time 0.06333ms
length 474 | time 0.02116ms | length 474 | time 0.02158ms | length 474 | time 0.09121ms
length 711 | time 0.03250ms | length 711 | time 0.03387ms | length 711 | time 0.14341ms
length 1066 | time 0.05099ms | length 1066 | time 0.05305ms | length 1066 | time 0.21971ms
length 1599 | time 0.08040ms | length 1599 | time 0.08349ms | length 1599 | time 0.33692ms
length 2398 | time 0.12971ms | length 2398 | time 0.13084ms | length 2398 | time 0.51396ms
length 3597 | time 0.20300ms | length 3597 | time 0.20893ms | length 3597 | time 0.79176ms
length 5395 | time 0.32087ms | length 5395 | time 0.32491ms | length 5395 | time 1.30021ms
The latter is the one defined inside object
and it's awful (about 4 times slower).
I've run benchmark with and without scalac optimise
option, and there are no noticeable differences (only slower compilation with optimise
).
It's just one of many bugs in specialization--I'm not sure whether this one's been reported on the bug tracker or not. If you throw an exception from your sort, you'll see that it calls the generic version not the specialized version of the second sort:
java.lang.Exception: Boom!
at magicgoose.sorting.DualPivotQuicksortGenSingleton$.magicgoose$sorting$DualPivotQuicksortGenSingleton$$sort(DualPivotQuicksortGenSingleton.scala:33)
at magicgoose.sorting.DualPivotQuicksortGenSingleton$.sort$mFc$sp(DualPivotQuicksortGenSingleton.scala:13)
Note how the top thing on the stack is DualPivotQuicksortGenSingleton$$sort(...)
instead of ...sort$mFc$sp(...)
? Bad compiler, bad!
As a workaround, you can wrap your private methods inside a final helper object, e.g.
def sort[@ spec T](a: Array[T]) { Helper.sort(a,0,a.length) }
private final object Helper {
def sort[@spec T](a: Array[T], i0: Int, i1: Int) { ... }
}
For whatever reason, the compiler then realizes that it ought to call the specialized variant. I haven't tested whether every specialized method that is called by another needs to be inside its own object; I'll leave that to you via the exception-throwing trick.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With