Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Performance of exhaustive Haskell pattern-matching

I am wondering how Haskell pattern matching is resolved internally and how this impacts performance. Say we have a function that is expensive to compute so we precompute the first and / or more frequently used values and pattern match on the corresponding input before making the actual computation itself:

expensiveComputation :: Int -> Int
expensiveComputation 1 = 101
expensiveComputation 2 = 3333333
expensiveComputation 3 = 42
...
expensiveComputation n = the actual function

We could precompute a lot of these cases, thousands if we wanted, but do we? I'm assuming that it takes time for Haskell to actually find the pattern that matches the input, so maybe it is actually quicker to compute the, say, 1000th value instead of making a 1000 pattern matches?

like image 837
Petras Purlys Avatar asked Feb 01 '18 21:02

Petras Purlys


People also ask

Does Haskell have pattern matching?

Overview. We use pattern matching in Haskell to simplify our codes by identifying specific types of expression. We can also use if-else as an alternative to pattern matching. Pattern matching can also be seen as a kind of dynamic polymorphism where, based on the parameter list, different methods can be executed.

How does pattern matching work Haskell?

Pattern matching consists of specifying patterns to which some data should conform and then checking to see if it does and deconstructing the data according to those patterns. When defining functions, you can define separate function bodies for different patterns.


1 Answers

I made a simple test in the form:

f 0 = 4
f 1 = 5
...

And compiled with ghc -O2 -ddump-asm -c. I observe key ASM of:

What appears to be implementations for each top level equation:

_c2aD:
        movl $28,%ebx    ; N.B. the constant 28 matches my largest value
        jmp *(%rbp)
_c2aC:
        movl $27,%ebx
        jmp *(%rbp)
_c2aB:
        movl $26,%ebx
        jmp *(%rbp)
....

What appears to be a jump table referencing the above implementations:

_n2aN:
        .quad   _c2af-(_n2aN)+0
        .quad   _c2ag-(_n2aN)+0
        .quad   _c2ah-(_n2aN)+0
        .quad   _c2ai-(_n2aN)+0
        .quad   _c2aj-(_n2aN)+0
        .quad   _c2ak-(_n2aN)+0
        ...

What appears to be a pair of range guards followed by a dispatch:

_c2aE:
        cmpq $25,%r14        ; N.B. the last defined input is 24
        jge _c2ae
_u2aH:
        testq %r14,%r14
        jl _c2ae
_u2aI:
        leaq _n2aN(%rip),%rax
        addq (%rax,%r14,8),%rax
        jmp *%rax

So do I think this will be slow for 1000 entries? No, if they are contiguous I bet GHC will generate a jump table just like I'm seeing here. Another interesting test would be if they aren't contiguous.

like image 199
Thomas M. DuBuisson Avatar answered Dec 11 '22 04:12

Thomas M. DuBuisson