Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Finding the complexity of Haskell functions

How can I find the complexity (in terms of big-O) for different Haskell functions?

For example, what is the complexity of subsequences?

like image 761
m0nhawk Avatar asked Dec 02 '13 17:12

m0nhawk


1 Answers

You can only calculate the exact complexity of a function by looking at the code. However, you can estimate it using criterion.

For example, the following program estimates the complexity of subsequence as a function of the length of the list.

module Main where

import Criterion (bench, nf)
import Criterion.Main (defaultMain)
import Data.List (subsequences)
import Control.DeepSeq (deepseq)

main = defaultMain (map createBenchmark [0, 2 .. 24])
    where
        createBenchmark n =
            let
                xs = replicate n 'x'
            in
                xs `deepseq` (bench (show n) $ nf subsequences xs)

If you compile it (with -O2!) and run it with

./Main -u report

(or

./Main --csv report

in newer versions of criteron)

you'll get a CSV file with the data (mean time, variance, and other information per run).

If you plot that data you will realise it is exponential in n as the following gnuplot session shows.

> f(x) = a * exp(b * x)
> fit f(x) 'report' using ($0 * 2):2 every ::2 via a,b
...

Final set of parameters            Asymptotic Standard Error
=======================            ==========================

a               = 1.7153e-07       +/- 5.441e-09    (3.172%)
b               = 0.711104         +/- 0.001438     (0.2023%)


correlation matrix of the fit parameters:

               a      b      
a               1.000 
b              -1.000  1.000
> set log y
> set xlabel 'n'
> set ylabel 'mean time [s]'
> plot 'report' using ($0 * 2):2 every ::2 with lp title 'data', f(x) title 'fit'

plot

a is approximately zero, and b has almost no error. So it's a pretty sure guess that complexity is O(2^n), because e^0.71 is almost exactly 2.

Of course, this technique assumes that you're actually using everything returned by the function. If you're only accessing the first element of the returned list, the complexity will be O(1) because of lazy evaluation.

You can probably find a way to make this program independent of the function to benchmark (at least for functions that just take a list). There are also some nice Haskell libraries for plotting data, so you don't need to rely on external programs (unfortunately, as a scientist I never used anything but gnuplot).

like image 136
Tobias Brandt Avatar answered Oct 14 '22 00:10

Tobias Brandt