I always thought that Haskell would do some sort of automatic intelligent memoizing. E.g., the naive Fibonacci implementation
fib 0 = 0 fib 1 = 1 fib n = fib (n-2) + fib (n-1)
would be fast because of that. Now I read this and it seems I was wrong -- Haskell doesn't seem to do automatic memoization. Or do I understand something wrong?
Are there other languages which do automatic (i.e. implicit, not explicit) memoization?
What are common ways to implement memoization? In all sample implementations I have seen, they use a hashmap but there isn't any limit in its size. Obviously, this wouldn't work in practice because you need some sort of limit. And given that, it becomes more complicated because when you hit the limit, you have to throw some data away. And there it becomes complicated: Should the limit maybe be dynamically and often used functions should have a higher limit than less often used functions? And what entry do you throw away when you hit the limit? Just the latest used one? In that case, you need to have your data also sorted in addition. You could use some combination of a linked list and a hash map to achieve that. Is that the common way?
Could you maybe link (or refer) to some common real-world implementations?
Thanks, Albert
Edit: I am mostly interested in that problem I described, i.e. how to implement such a limit. Any references to any papers which address this would be very nice.
Edit: Some own thoughts with a sample implementation (which has a limit) can be found here.
Edit: I am not trying to solve a specific problem in a specific application. I am searching for general solutions for memoization which could be applied globally to all functions of a (pure functional) program (thus algorithms which don't implement a memory limit are not a solution). Of course there (probably) is no optimal/best solution. But this makes my question not less interesting.
For trying out such solution, I thought about adding it to Haskell as an optimization. I really wonder how well that would perform.
And I wonder if someone has done that already.
In computing, memoization or memoisation is an optimization technique used primarily to speed up computer programs by storing the results of expensive function calls and returning the cached result when the same inputs occur again.
JavaScript Memoization Example The Fibonacci sequence is a set of numbers that starts with a one or a zero, followed by a one, and proceeds based on the rule that each number (called a Fibonacci number) is equal to the sum of the preceding two numbers. It looks like this: 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, …
Memoization is a technique for improving the performance of recursive algorithms. It involves rewriting the recursive algorithm so that as answers to problems are found, they are stored in an array. Recursive calls can look up results in the array rather than having to recalculate them.
Programming Languages that support functional programming: Haskell, JavaScript, Python, Scala, Erlang, Lisp, ML, Clojure, OCaml, Common Lisp, Racket.
I said in a comment that your requirements sounded like garbage collection. I thought this because you are interested in managing a limited pool of memory, purging it once in a while so that it doesn't go over.
Now that I think about it, it's more like a virtual memory page replacement algorithm. You can read that Wikipedia page for various approaches used to solve that kind of problem, such as "not recently used", "aging", "clock", "second chance", etc.
However, memoization isn't often done by restricting the results retained; the required mutation for the above-mentioned algorithms is generally un-haskellish. Don't let that discourage you, though. You have interesting ideas that could be valuable additions to the exploration of memoization possibilities in Haskell performed thusfar.
Sometimes a particular memoization problem lends itself well to limited memory. For example, aligning two gene sequences can be done with Dynamic Programming (see Wikipedia's Dynamic Programming # Sequence alignment) with a 2-dimensional memoization table. But since the DP solution for a given cell only depends on the results from the preceding row, you can start from the bottom and discard rows that are more than 1 away from the current row. Fibonacci numbers are the same: you only need the previous two numbers in the sequence in order to compute the next. You can discard anything earlier if all you are interested in is the nth number.
Most memoizations are for the sake of speeding up recursive algorithms where there are shared subproblems. Many such problems do not have an easy way of sequencing the evaluations in order to throw away the results you no longer need. At that point, you're just guessing, using heuristics (like frequency of use) to determine who gets how much access to the limited resource.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With