Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Idiomatic efficient Haskell append?

List and the cons operator (:) are very common in Haskell. Cons is our friend. But sometimes I want to add to the end of a list instead.

xs `append` x = xs ++ [x]

This, sadly, is not an efficient way to implement it.

I wrote up Pascal's triangle in Haskell, but I had to use the ++ [x] anti-idiom:

ptri = [1] : mkptri ptri
mkptri (row:rows) = newRow : mkptri rows
    where newRow = zipWith (+) row (0:row) ++ [1]

imho, this is a lovely readable Pascal's triangle and all, but the anti-idiom irks me. Can someone explain to me (and, ideally, point me to a good tutorial) on what the idiomatic data structure is for cases where you want to append to the end efficiently? I'm hoping for near-list-like beauty in this data structure and its methods. Or, alternately, explain to me why this anti-idiom is actually not that bad for this case (if you believe such to be the case).


[edit] The answer I like the best is Data.Sequence, which does indeed have "near-list-like beauty." Not sure how I feel about the required strictness of operations. Further suggestions and different ideas are always welcome.

import Data.Sequence ((|>), (<|), zipWith, singleton)
import Prelude hiding (zipWith)

ptri = singleton 1 : mkptri ptri

mkptri (seq:seqs) = newRow : mkptri seqs
    where newRow = zipWith (+) seq (0 <| seq) |> 1

Now we just need List to be a class, so that other structures can use its methods like zipWith without hiding it from Prelude, or qualifying it. :P

like image 946
Dan Burton Avatar asked Mar 04 '11 00:03

Dan Burton


4 Answers

Keep in mind that what looks poor asymptotics might actually not be, because you are working in a lazy language. In a strict language, appending to the end of a linked list in this way would always be O(n). In a lazy language, it's O(n) only if you actually traverse to the end of the list,in which case you would have spent O(n) effort anyway. So in many cases, laziness saves you.

This isn't a guarantee... for example, k appends followed by a traversal will still run in O(nk) where it could have been O(n+k). But it does change the picture somewhat. Thinking about performance of single operations in terms of their asymptotic complexity when the result is immediately forced doesn't always give you the right answer in the end.

like image 62
Chris Smith Avatar answered Nov 01 '22 23:11

Chris Smith


Standard Sequence has O(1) for addition from 'both ends' and O(log(min(n1,n2))) for general concatenation:

http://hackage.haskell.org/packages/archive/containers/latest/doc/html/Data-Sequence.html

The difference from lists though is that Sequence is strict

like image 17
Ed'ka Avatar answered Nov 02 '22 00:11

Ed'ka


Something like this explicit recursion avoids your append "anti-idiom". Although, I don't think it is as clear as your example.

ptri = []:mkptri ptri
mkptri (xs:ys) = pZip xs (0:xs) : mkptri ys
    where pZip (x:xs) (y:ys) = x+y : pZip xs ys
          pZip [] _ = [1]
like image 10
David Powell Avatar answered Nov 02 '22 00:11

David Powell


In your code for Pascal's Triangle, ++ [x] is not actually a problem. Since you have to produce a new list on the left hand side of ++ anyway, your algorithm is inherently quadratic; you cannot make it asymptotically faster merely by avoiding ++.

Also, in this particular case, when you compile -O2, GHC's list fusion rules (should) eliminate the copy of the list that ++ would normally create. This is because zipWith is a good producer and ++ is a good consumer in it's first argument. You can read about these optimizations in GHC User's Guide.

like image 8
lpsmith Avatar answered Nov 01 '22 23:11

lpsmith