I ran into an unexpected code optimization recently, and wanted to check whether my interpretation of what I was observing was correct. The following is a much simplified example of the situation:
let demo =
let swap fst snd i =
if i = fst then snd else
if i = snd then fst else
i
[ for i in 1 .. 10000 -> swap 1 i i ]
let demo2 =
let swap (fst: int) snd i =
if i = fst then snd else
if i = snd then fst else
i
[ for i in 1 .. 10000 -> swap 1 i i ]
The only difference between the 2 blocks of code is that in the second case, I explicitly declare the arguments of swap as integers. Yet, when I run the 2 snippets in fsi with #time, I get:
Case 1 Real: 00:00:00.011, CPU: 00:00:00.000, GC gen0: 0, gen1: 0, gen2: 0
Case 2 Real: 00:00:00.004, CPU: 00:00:00.015, GC gen0: 0, gen1: 0, gen2: 0
i.e. the 2nd snippet runs 3 times faster than the first. The absolute performance difference here is obviously not an issue, but if I were to use the swap function a lot, it would pile up.
My assumption is that the reason for the performance hit is that in the first case, swap is generic and "requires equality", and checks whether int supports it, whereas the second case doesn't have to check anything. Is this the reason this is happening, or am I missing something else? And more generally, should I consider automatic generalization a double-edged sword, that is, an awesome feature which may have unexpected effects on performance?
I think this is generally the same case as in the question Why is this F# code so slow. In that question, the performance issue is caused by a constraint requiring comparison
and in your case, it is caused by the equality
constraint.
In both cases, the compiled generic code has to use interfaces (and boxing), while the specialized compiled code can directly use IL instructions for comparison or equality of integers or floating-point numbers.
The two ways to avoid the performance issue are:
int
or float
as you didinline
so that the compiler specializes it automaticallyFor smaller functions, the second approach is better, because it does not generate too much code and you can still write functions in a generic way. If you use function just for single type (by design), then using the first approach is probably appropriate.
The reason for the difference is that the compiler is generating calls to
IL_0002: call bool class [FSharp.Core]Microsoft.FSharp.Core.LanguagePrimitives/HashCompare::GenericEqualityIntrinsic<!!0> (!!0, !!0)
in the generic version, whilst the int
version can just compare directly.
If you used inline
I suspect that this problem would disappear as the compiler now has extra type information
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With