Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Memoizing multiplication

My application multiplies vectors after a (costly) conversion using an FFT. As a result, when I write

f :: (Num a) => a -> [a] -> [a]
f c xs = map (c*) xs

I only want to compute the FFT of c once, rather than for every element of xs. There really isn't any need to store the FFT of c for the entire program, just in the local scope.

I attempted to define my Num instance like:

data Foo = Scalar c
         | Vec Bool v -- the bool indicates which domain v is in

instance Num Foo where
    (*) (Scalar c) = \x -> case x of
                         Scalar d -> Scalar (c*d)
                         Vec b v-> Vec b $ map (c*) v
    (*) v1 = let Vec True v = fft v1
             in \x -> case x of
                    Scalar d -> Vec True $ map (c*) v
                    v2 -> Vec True $ zipWith (*) v (fft v2)

Then, in an application, I call a function similar to f (which works on arbitrary Nums) where c=Vec False v, and I expected that this would be just as fast as if I hack f to:

g :: Foo -> [Foo] -> [Foo]
g c xs = let c' = fft c
         in map (c'*) xs

The function g makes the memoization of fft c occur, and is much faster than calling f (no matter how I define (*)). I don't understand what is going wrong with f. Is it my definition of (*) in the Num instance? Does it have something to do with f working over all Nums, and GHC therefore being unable to figure out how to partially compute (*)?

Note: I checked the core output for my Num instance, and (*) is indeed represented as nested lambdas with the FFT conversion in the top level lambda. So it looks like this is at least capable of being memoized. I have also tried both judicious and reckless use of bang patterns to attempt to force evaluation to no effect.

As a side note, even if I can figure out how to make (*) memoize its first argument, there is still another problem with how it is defined: A programmer wanting to use the Foo data type has to know about this memoization capability. If she wrote

map (*c) xs

no memoization would occur. (It must be written as (map (c*) xs)) Now that I think about it, I'm not entirely sure how GHC would rewrite the (*c) version since I have curried (*). But I did a quick test to verify that both (*c) and (c*) work as expected: (c*) makes c the first arg to *, while (*c) makes c the second arg to *. So the problem is that it is not obvious how one should write the multiplication to ensure memoization. Is this just an inherent downside to the infix notation (and the implicit assumption that the arguments to * are symmetric)?

The second, less pressing issue is that the case where we map (v*) onto a list of scalars. In this case, (hopefully) the fft of v would be computed and stored, even though it is unnecessary since the other multiplicand is a scalar. Is there any way around this?

Thanks

like image 344
crockeea Avatar asked Feb 26 '13 02:02

crockeea


People also ask

What is a Memoized value?

In computing, memoization or memoisation is an optimization technique used primarily to speed up computer programs by storing the results of expensive function calls and returning the cached result when the same inputs occur again.

Why is it called memoization and not memorization?

The term "memoization" was introduced by Donald Michie in the year 1968. It's based on the Latin word memorandum, meaning "to be remembered". It's not a misspelling of the word memorization, though in a way it has something in common. Memoisation is a technique used in computing to speed up programs.

Why is memoization faster?

In programming, memoization is an optimization technique that makes applications more efficient and hence faster. It does this by storing computation results in cache, and retrieving that same information from the cache the next time it's needed instead of computing it again.

Is memoization the same as caching?

Memoization is a specific form of caching that involves caching the return value of a function based on its parameters. Caching is a more general term; for example, HTTP caching is caching but not memoization.


2 Answers

I believe stable-memo package could solve your problem. It memoizes values not using equality but by reference identity:

Whereas most memo combinators memoize based on equality, stable-memo does it based on whether the exact same argument has been passed to the function before (that is, is the same argument in memory).

And it automatically drops memoized values when their keys are garbage collected:

stable-memo doesn't retain the keys it has seen so far, which allows them to be garbage collected if they will no longer be used. Finalizers are put in place to remove the corresponding entries from the memo table if this happens.

So if you define something like

fft = memo fft'
  where fft' = ... -- your old definition

you'll get pretty much what you need: Calling map (c *) xs will memoize the computation of fft inside the first call to (*) and it gets reused on subsequent calls to (c *). And if c is garbage collected, so is fft' c.

See also this answer to How to add fields that only cache something to ADT?

like image 105
Petr Avatar answered Oct 20 '22 19:10

Petr


I can see two problems that might prevent memoization:

First, f has an overloaded type and works for all Num instances. So f cannot use memoization unless it is either specialized (which usually requires a SPECIALIZE pragma) or inlined (which may happen automatically, but is more reliable with an INLINE pragma).

Second, the definition of (*) for Foo performs pattern matching on the first argument, but f multiplies with an unknown c. So within f, even if specialized, no memoization can occur. Once again, it very much depends on f being inlined, and a concrete argument for c to be supplied, so that inlining can actually appear.

So I think it'd help to see how exactly you're calling f. Note that if f is defined using two arguments, it has to be given two arguments, otherwise it cannot be inlined. It would furthermore help to see the actual definition of Foo, as the one you are giving mentions c and v which aren't in scope.

like image 40
kosmikus Avatar answered Oct 20 '22 20:10

kosmikus