Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Repa performance versus lists

Tags:

haskell

ghc

repa

In the Numeric Haskell Repa Tutorial Wiki, there is a passage that reads (for context):

10.1 Fusion, and why you need it

Repa depends critically on array fusion to achieve fast code. Fusion is a fancy name for the combination of inlining and code transformations performed by GHC when it compiles your program. The fusion process merges the array filling loops defined in the Repa library, with the "worker" functions that you write in your own module. If the fusion process fails, then the resulting program will be much slower than it needs to be, often 10x slower an equivalent program using plain Haskell lists. On the other hand, provided fusion works, the resulting code will run as fast as an equivalent cleanly written C program. Making fusion work is not hard once you understand what's going on.

The part that I don't understand is this:

"If the fusion process fails, then the resulting program will be much slower than it needs to be, often 10x slower an equivalent program using plain Haskell lists."

I understand why it would run slower if stream fusion fails, but why does it run that much slower than lists?

Thanks!

like image 462
Charles Durham Avatar asked Dec 16 '22 16:12

Charles Durham


2 Answers

Typically, because lists are lazy, and Repa arrays are strict.

If you fail to fuse a lazy list traversal, e.g.

map f . map g

you pay O(1) cost per value for leaving the intermediate (lazy) cons cell there.

If you fail to fuse the same traversal over a strict sequence, you pay at least O(n) per value for allocating a strict intermediate array.

Also, since fusion mangles your code into an unrecognizable Stream data type, to improve analysis, you can be left with code that has just too many constructors and other overheads.

like image 148
Don Stewart Avatar answered Dec 31 '22 14:12

Don Stewart


Edit: This is not correct--see Don Nelson's comment (and his answer--he knows a lot more about the library than I do).

Immutable arrays cannot share components; disregarding fusion, any modification to an immutable array must reallocate the entire array. By contrast, while list operations are non-destructive, they can share parts: f i (h:t) = i:t, for example, replaces the head of a list in constant time by creating a new list in which the first cell points to the second cell of the original list. Moreover, because lists can be built incrementally, such functions as generators that build a list by repeated calls to a function can still run in O(n) time, while the equivalent function on an immutable array without fusion would need to reallocate the array with every call to the function, taking O(n^2) time.

like image 23
isturdy Avatar answered Dec 31 '22 14:12

isturdy