I am wondering how Haskell pattern matching is resolved internally and how this impacts performance. Say we have a function that is expensive to compute so we precompute the first and / or more frequently used values and pattern match on the corresponding input before making the actual computation itself:
expensiveComputation :: Int -> Int
expensiveComputation 1 = 101
expensiveComputation 2 = 3333333
expensiveComputation 3 = 42
...
expensiveComputation n = the actual function
We could precompute a lot of these cases, thousands if we wanted, but do we? I'm assuming that it takes time for Haskell to actually find the pattern that matches the input, so maybe it is actually quicker to compute the, say, 1000th value instead of making a 1000 pattern matches?
Overview. We use pattern matching in Haskell to simplify our codes by identifying specific types of expression. We can also use if-else as an alternative to pattern matching. Pattern matching can also be seen as a kind of dynamic polymorphism where, based on the parameter list, different methods can be executed.
Pattern matching consists of specifying patterns to which some data should conform and then checking to see if it does and deconstructing the data according to those patterns. When defining functions, you can define separate function bodies for different patterns.
I made a simple test in the form:
f 0 = 4
f 1 = 5
...
And compiled with ghc -O2 -ddump-asm -c
. I observe key ASM of:
What appears to be implementations for each top level equation:
_c2aD:
movl $28,%ebx ; N.B. the constant 28 matches my largest value
jmp *(%rbp)
_c2aC:
movl $27,%ebx
jmp *(%rbp)
_c2aB:
movl $26,%ebx
jmp *(%rbp)
....
What appears to be a jump table referencing the above implementations:
_n2aN:
.quad _c2af-(_n2aN)+0
.quad _c2ag-(_n2aN)+0
.quad _c2ah-(_n2aN)+0
.quad _c2ai-(_n2aN)+0
.quad _c2aj-(_n2aN)+0
.quad _c2ak-(_n2aN)+0
...
What appears to be a pair of range guards followed by a dispatch:
_c2aE:
cmpq $25,%r14 ; N.B. the last defined input is 24
jge _c2ae
_u2aH:
testq %r14,%r14
jl _c2ae
_u2aI:
leaq _n2aN(%rip),%rax
addq (%rax,%r14,8),%rax
jmp *%rax
So do I think this will be slow for 1000 entries? No, if they are contiguous I bet GHC will generate a jump table just like I'm seeing here. Another interesting test would be if they aren't contiguous.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With