
What are the locality properties of Haskell?

Modern-day CPUs are optimised so that access to and modification of the same place in memory (temporal locality), as well as of consecutive places in memory (spatial locality), are extremely fast operations.

Now, since Haskell is a pure language with immutable data, you naturally can't overwrite existing memory blocks, potentially making something like foldl much slower than a for loop in C that keeps reading and writing the same result variable.

Does Haskell do anything internally to mitigate this performance loss? And in general, what are its properties regarding locality?

Electric Coffee asked Apr 24 '15 09:04


2 Answers

The general rule is that for "vanilla" Haskell programming you get very little (if any) control over memory layout and memory locality.

However, there do exist a number of more advanced features that allow such control, and libraries that expose friendly abstractions on top of these. The vector library is probably the most popular of the latter. This library provides several fixed-size array types, two of which (Data.Vector.Unboxed and Data.Vector.Storable) give you data locality by representing vectors and their contents as contiguous memory arrays. Data.Vector.Unboxed even contains a simple automatic "structure of arrays" transformation—an unboxed vector of pairs will be represented as a pair of unboxed vectors, one for each of the pairs' components.
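
As a sketch of what that looks like in practice (the names here are mine, not from the library's documentation), an unboxed vector of pairs is stored as two contiguous arrays, one per component:

    import qualified Data.Vector.Unboxed as U

    -- A vector of (Int, Double) pairs. Data.Vector.Unboxed stores this as
    -- two contiguous unboxed arrays (structure of arrays): one of Ints,
    -- one of Doubles.
    points :: U.Vector (Int, Double)
    points = U.fromList [(i, fromIntegral i * 0.5) | i <- [0 .. 9]]

    -- Summing one component walks a single contiguous array.
    sumSeconds :: Double
    sumSeconds = U.sum (U.map snd points)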

Another example is the JuicyPixels library for image processing, which represents images in memory as contiguous bitmaps. This actually bottoms out to Data.Vector.Storable, which exploits a standard facility (Foreign.Storable) for translating user-defined Haskell data types to and from raw bytes.
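
To make the Foreign.Storable part concrete, here is a hedged sketch with a hypothetical RGB pixel type (this is not JuicyPixels' actual internal representation). The hand-written Storable instance tells GHC how each value maps to raw bytes, so a Data.Vector.Storable vector of them is one contiguous buffer:

    import Data.Word (Word8)
    import Foreign.Storable (Storable (..))
    import qualified Data.Vector.Storable as S

    -- A hypothetical 3-byte RGB pixel.
    data RGB = RGB !Word8 !Word8 !Word8

    instance Storable RGB where
      sizeOf    _ = 3
      alignment _ = 1
      peek p      = RGB <$> peekByteOff p 0 <*> peekByteOff p 1 <*> peekByteOff p 2
      poke p (RGB r g b) =
        pokeByteOff p 0 r >> pokeByteOff p 1 g >> pokeByteOff p 2 b

    -- 640 pixels laid out back to back in one contiguous block.
    blackRow :: S.Vector RGB
    blackRow = S.replicate 640 (RGB 0 0 0)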

But the general pattern is this: in Haskell, when you're interested in memory locality, you identify which data needs to benefit from it and bundle it together in a custom data type whose implementation was designed to provide locality and performance guarantees. Writing such a data type is an advanced undertaking, but most of the legwork has been done already in a reusable fashion (note for example that JuicyPixels mostly just reuses vector).

Note also that:

  1. vector provides stream fusion optimizations to eliminate intermediate arrays when you apply nested vector transformations. If you generate a vector from 0 to 1,000,000, filter it down to the even numbers, map the (^2) function over that and sum the elements of the result, no array is ever allocated—the library has the smarts to rewrite that into an accumulator loop from 0 to 1,000,000 (see the sketch after this list). So a foldl of a vector isn't necessarily slower than a for loop—there might be no array at all!
  2. vector provides mutable arrays as well. More generally, in Haskell you can overwrite existing memory if you really insist. It's just (a) not the default paradigm in the language, and therefore (b) a little bit clunky, but absolutely tractable if you just need it in a few performance-sensitive spots.
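
A hedged sketch of both points (the function names are mine): the first pipeline fuses into a single loop with no intermediate vector, and the second mutates a vector in place before freezing it back into an immutable one:

    import qualified Data.Vector.Unboxed as U
    import qualified Data.Vector.Unboxed.Mutable as M

    -- Point 1: with fusion (-O2), no intermediate vector is allocated here;
    -- the whole pipeline compiles down to one accumulator loop.
    fusedSum :: Int
    fusedSum = U.sum (U.map (^ (2 :: Int)) (U.filter even (U.enumFromTo 0 1000000)))

    -- Point 2: overwrite memory in place with a mutable vector.
    counters :: IO (U.Vector Int)
    counters = do
      mv <- M.replicate 10 (0 :: Int)   -- mutable buffer of ten zeroes
      x  <- M.read mv 3                 -- read slot 3 ...
      M.write mv 3 (x + 1)              -- ... and overwrite it in place
      U.freeze mv                       -- freeze into an immutable vector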

So most of the time, the answer to "I want memory locality" is "use vector."

Luis Casillas answered Nov 10 '22 16:11


Haskell is an extremely high-level language, and you're asking a question about an extremely low-level detail.

Overall, Haskell's performance is probably similar to any garbage-collected language like Java or C#. In particular, Haskell has mutable arrays, which will have performance similar to any other array. (You may need unboxed arrays to match C performance.)
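
For instance (a hedged sketch using the standard array package, with names of my own), an unboxed IOUArray gives you an in-place, C-like update loop:

    import Data.Array.IO

    -- Increment every slot of an unboxed, mutable array in place.
    bumpAll :: IOUArray Int Int -> IO ()
    bumpAll arr = do
      (lo, hi) <- getBounds arr
      mapM_ (\i -> readArray arr i >>= writeArray arr i . (+ 1)) [lo .. hi]

    main :: IO ()
    main = do
      arr <- newArray (0, 9) 0 :: IO (IOUArray Int Int)
      bumpAll arr
      print =<< getElems arr   -- [1,1,1,1,1,1,1,1,1,1]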

For something like a fold, if the final result is something like a machine integer, it probably ends up in a processor register for the entire duration of the loop. So the final machine code is pretty much identical to "a continuously-accessed result variable in C". (If the result is a dictionary or something, then probably not, but the same would be true in C as well.)
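
As a sketch of what that looks like in source (my own example, not from the question), a strict left fold with a machine-integer accumulator; with -O2, GHC unboxes the accumulator so the running total stays in a register for the whole loop:

    import Data.List (foldl')

    -- Sum of squares with a strict Int accumulator; compiled with -O2,
    -- the accumulator is unboxed and kept in a register across the loop.
    sumSquares :: Int -> Int
    sumSquares n = foldl' (\acc x -> acc + x * x) 0 [1 .. n]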

More generally, if locality is something that matters to you, any garbage-collected language probably isn't your friend. But, again, you can use unboxed arrays to work around that.

All of this talk is great and all, but if you really want to know how fast a specific Haskell program is, benchmark it. It turns out well-written Haskell programs are usually quite fast. (Just like most compiled languages.)
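
As an illustration (the answer doesn't name a tool, so this is an assumption), the criterion library is a common way to do that benchmarking; a minimal sketch with a made-up function:

    import Criterion.Main (bench, defaultMain, whnf)

    -- A tiny benchmark: measure a strict sum over a million Ints.
    sumTo :: Int -> Int
    sumTo n = sum [1 .. n]

    main :: IO ()
    main = defaultMain
      [ bench "sumTo 1000000" (whnf sumTo 1000000) ]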

Added: You can ask GHC to output partially-compiled code in its Core format, which is lower-level than Haskell but higher-level than machine code. This lets you see what the compiler has decided to do (in particular, where stuff has been inlined, where abstractions have been removed, etc.). This can help you find out what the final code looks like without having to go all the way down to machine code.
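
For example (a sketch with a hypothetical module of my own), the -ddump-simpl flag prints the simplified Core, and -dsuppress-all trims the annotation noise:

    -- Compile with:
    --   ghc -O2 -ddump-simpl -dsuppress-all Example.hs
    -- and GHC prints the optimised Core for this module, so you can see
    -- how the list pipeline below was inlined and fused.
    module Example where

    squareSum :: Int -> Int
    squareSum n = sum (map (^ (2 :: Int)) [1 .. n])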

MathematicalOrchid answered Nov 10 '22 16:11