Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Extending Immutable types (or: fast cache for immutable types) in OCaml

I have a recursive immutable data structure in ocaml which can be simplified to something like this:

type expr =
{
    eexpr : expr_expr;
    some_other_complex_field : a_complex_type;
}

and expr_expr =
    | TInt of int
    | TSum of (expr * expr)
    | TMul of (expr * expr)

It's an AST, and sometimes it gets pretty complex (it's very deep).

there is a recursive function that evaluates an expression. For example, let's say,

let rec result expr =
    match expr.eexpr with
        | TInt i -> i
        | TSum (e1, e2) -> result e1 + result e2
        | TMul (e1, e2) -> result e1 * result e2

Now suppose I am mapping an expression to another expression, and I need to constantly check the result of an expr, sometimes more than once for the same expr, and sometimes for expressions that were recently mapped by using the pattern

{ someExpr with eexpr = TSum(someExpr, otherExpr) }

Now, the result function is very lightweight, but running it many times for a deep AST will not be very optimized. I know I could cache the value using a Hashtbl, but AFAIK the Hashtbl will only do structural equality, so it will need to traverse my long AST anyway. I know the best option would be to include a probably immutable "result" field in the expr type. But I can't.

So is there any way in Ocaml to cache a value to an immutable type, so I don't have to calculate it eagerly every time I need it ?

Thanks!

like image 702
Waneck Avatar asked Feb 03 '12 01:02

Waneck


People also ask

What is cache-control immutable?

What is Cache-Control: immutable? #. The Cache-Control: immutable directive was first introduced in Firefox 49 to provide the browser with hints as to which resources never change.

What are compound data types in OCaml?

We have already seen simple data types such as int, float, string, and bool. Let's recap the built-in compound data types we can use in OCaml to combine such values. First, we have lists which are ordered collections of any number of elements of like type: Next, we have tuples, which collect a fixed number of elements together:

Should I add the immutable directive to my Cache-Control header?

All browsers that do not support the immutable directive simply ignore it, therefore it is safe to add and won't cause any conflicts. If you have static assets that you know aren't going to change for an extended period of time it may be worthwhile adding the immutable directive to your Cache-Control response header.

What is the difference between always-matching _ and tuple in OCaml?

There are two ways this matters: the memory layout differs between the two (a tuple is an extra indirection), and the ability to create or match using a tuple: Note, however, that OCaml allows us to use the always-matching _ in either version:


2 Answers

Hash-cons the values of expr_expr. By doing this structurally equal values in your program will share exactly the same memory representation and you can substitute structural equality (=) by physical equality (==).

This paper should get you quickly started on hash-consing in OCaml.

like image 94
Daniel Bünzli Avatar answered Oct 05 '22 00:10

Daniel Bünzli


You can use the functorial interface to control the kind of equality used by the hash table. I believe the semantics of (==) are legitimate for your purposes; i.e., if A == B then f A = f B for any pure function f. So you can cache the results of f A. Then if you find a B that's physically equal to A, the cached value is correct for B.

The downside of using (==) for hashing is that the hash function will send all structurally equal objects to the same hash bucket, where they will be treated as distinct objects. If you have a lot of structurally equal objects in the table, you get no benefit from the hashing. The behavior degenerates to a linear search.

You can't define the hash function to work with physical addresses, because the physical addresses can be changed at any time by the garbage collector.

However, if you know your table will only contain relatively few large-ish values, using physical equality might work for you.

like image 30
Jeffrey Scofield Avatar answered Oct 05 '22 01:10

Jeffrey Scofield