Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Short-lived memoization in Haskell?

In an object-oriented language when I need to cache/memoize the results of a function for a known life-time I'll generally follow this pattern:

  1. Create a new class
  2. Add to the class a data member and a method for each function result I want to cache
  3. Implement the method to first check to see if the result has been stored in the data member. If so, return that value; else call the function (with the appropriate arguments) and store the returned result in the data member.
  4. Objects of this class will be initialized with values that are needed for the various function calls.

This object-based approach is very similar to the function-based memoization pattern described here: http://www.bardiak.com/2012/01/javascript-memoization-pattern.html

The main benefit of this approach is that the results are kept around only for the life time of the cache object. A common use case is in the processing of a list of work items. For each work item one creates the cache object for that item, processes the work item with that cache object then discards the work item and cache object before proceeding to the next work item.

What are good ways to implement short-lived memoization in Haskell? And does the answer depend on if the functions to be cached are pure or involve IO?

Just to reiterate - it would be nice to see solutions for functions which involve IO.

like image 707
ErikR Avatar asked Feb 24 '12 20:02

ErikR


1 Answers

Let's use Luke Palmer's memoization library: Data.MemoCombinators

import qualified Data.MemoCombinators as Memo
import Data.Function (fix) -- we'll need this too

I'm going to define things slightly different from how his library does, but it's basically the same (and furthermore, compatible). A "memoizable" thing takes itself as input, and produces the "real" thing.

type Memoizable a = a -> a

A "memoizer" takes a function and produces the memoized version of it.

type Memoizer a b = (a -> b) -> a -> b

Let's write a little function to put these two things together. Given a Memoizable function and a Memoizer, we want the resultant memoized function.

runMemo :: Memoizer a b -> Memoizable (a -> b) -> a -> b
runMemo memo f = fix (f . memo)

This is a little magic using the fixpoint combinator (fix). Never mind that; you can google it if you are interested.

So let's write a Memoizable version of the classic fib example:

fib :: Memoizable (Integer -> Integer)
fib self = go
  where go 0 = 1
        go 1 = 1
        go n = self (n-1) + self (n-2)

Using a self convention makes the code straightforward. Remember, self is what we expect to be the memoized version of this very function, so recursive calls should be on self. Now fire up ghci.

ghci> let fib' = runMemo Memo.integral fib
ghci> fib' 10000
WALL OF NUMBERS CRANKED OUT RIDICULOUSLY FAST

Now, the cool thing about runMemo is you can create more than one freshly memoized version of the same function, and they will not share memory banks. That means that I can write a function that locally creates and uses fib', but then as soon as fib' falls out of scope (or earlier, depending on the intelligence of the compiler), it can be garbage collected. It doesn't have to be memoized at the top level. This may or may not play nicely with memoization techniques that rely on unsafePerformIO. Data.MemoCombinators uses a pure, lazy Trie, which fits perfectly with runMemo. Rather than creating an object which essentially becomes a memoization manager, you can simply create memoized functions on demand. The catch is that if your function is recursive, it must be written as Memoizable. The good news is you can plug in any Memoizer that you wish. You could even use:

noMemo :: Memoizer a b
noMemo f = f

ghci> let fib' = runMemo noMemo fib
ghci> fib' 30 -- wait a while; it's computing stupidly
1346269
like image 186
Dan Burton Avatar answered Nov 18 '22 17:11

Dan Burton