I am writing an OCaml function where I need to merge two maps. I have not been able to figure out the semantics of the merge
function provided by functor Map.Make
(found in version 3.12.0 of OCaml). Could someone provide me with a more detailed explanation than the OCaml manual ? An example would probably enough for me to figure it out.
Additionally, the two maps that I need to merge have some interesting properties: the keys have the same type (int
, actually), and their domain is disjoint. Would there be a more efficient approach than the merge routine ?
merge
takes a function and two maps. For each key present in either map, the function will be called. If the key is only present in one of the maps, the value for the other one will be passed in as None (which is why the arguments are options). If the functions returns Some x
, the new map will have the value x
for the key in question. Otherwise the key won't be present.
Example:
let map1 = add 1 2 (add 2 3 empty);;
let map2 = add 2 5 (add 3 4 empty);;
let map3 = merge (fun k xo yo -> match xo,yo with
| Some x, Some y -> Some (x+y)
| _ -> None
) map1 map2;;
map3 now contains the mapping 2 -> 8
.
If you change it to:
let map3 = merge (fun k xo yo -> match xo,yo with
| Some x, Some y -> Some (x+y)
| None, yo -> yo
| xo, None -> xo
) map1 map2;;
It will contain the mappings 1 -> 2
, 2 -> 8
and 3 -> 4
.
Since your maps are disjoint then you can just loop through the smaller map and insert into the larger:
let disjoint_merge m1 m2 =
if (IntMap.cardinal m1) < (IntMap.cardinal m2) then
IntMap.fold IntMap.add m1 m2
else
IntMap.fold IntMap.add m2 m1
In the event that you are using an older version of OCaml that doesn't include the cardinal
function you can just choose one map to always iterate through. As for what the above code does, is it uses:
val fold : (key -> 'a -> 'b -> 'b) -> 'a t -> 'b -> 'b
Which basically iterates through all the elements a map, and calls a function that takes a key, value and some other variable of type 'b
and returns something of that type ('b
). In our case we are passing the function IntMap.add
and that other variable is our second map. So it iterates through all of the elements and adds them to the other map.
EDIT: You are better off just doing:
let disjoint_merge m1 m2 =
IntMap.fold IntMap.add m1 m2
EDIT2: even better:
let disjoint_merge = IntMap.fold IntMap.add;;
I just looked at the implementation of cardinal
and it walks the tree and returns the count. So by checking sizes you are doing O(n + m + min(n,m)log(max(n,m))) instead of just O(nlog(m)).
Based on the first answer, and considering the additional question (merge maps in the case of disjoint domains), I would then propose the following generic merge routine:
let merge_disjoint m1 m2 =
IntMap.merge
(fun k x0 y0 ->
match x0, y0 with
None, None -> None
| None, Some v | Some v, None -> Some v
| _, _ -> invalid_arg "merge_disjoint: maps are not disjoint")
m1 m2
Would there be some more efficient way to do this?
-- David.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With