Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Lazy "n choose k" in OCaml

As part of a bigger problem of enumerating a set, I need to write an OCaml function 'choose' which takes a list and outputs as the list of all possible sequences of size k made up of elements of that list (without repeating sequences which can be obtained from each other by permutation). The order they are put in the end list is not relevant.

For example,

choose 2 [1;2;3;4] = [[1;2];[1;3];[1;4];[2;3];[2;4];[3;4]]

Any ideas?

I would like to have the whole thing to be lazy, outputting a lazy list, but if you have a strict solution, that'll be very useful too.

like image 504
Surikator Avatar asked Oct 19 '10 14:10

Surikator


3 Answers

Plugging in again with a Haskell solution (it's just easier to work with lazy lists since they are built-in):

combinations 0 _ = [[]]
combinations k [] = []
combinations k (x:xs) = map (x:) (combinations (k-1) xs) ++ combinations k xs

The first two cases follow from the properties of binomial coefficients and more specifically: n choose 0 = 1 for all n including n=0 (that's why it is first to handle the case 0 choose 0). The other one is 0 choose k = 0. The third equation is exact translation of the recursive definition of combinations.

Unfortunately when you apply it to an infinite list it returns a trivial solution:

> take 10 $ combinations 3 [1..]
[[1,2,3],[1,2,4],[1,2,5],[1,2,6],[1,2,7],[1,2,8],[1,2,9],[1,2,10],[1,2,11],[1,2,12]]

EDIT: OK, so we really want to go trough each combination in a finite number of steps. With the above version we are obviously using only the expression to the left of ++ which generates only combinations starting with 1. We can work around this problem by defining an interesting list zipping function which builds a list by alternately picking the head of each of its argument lists (it's important to be non-strict in the second argument):

merge [] ys = ys
merge (x:xs) ys = x:merge ys xs

and use it instead of ++:

combinations k (x:xs) = map (x:) (combinations (k-1) xs) `merge` combinations k xs

lets see:

> let comb_10_3 = combinations 3 [1..10]
> let comb_inf_3 = combinations 3 [1..]
> take 10 comb_inf_3 
[[1,2,3],[2,3,4],[1,3,4],[3,4,5],[1,2,4],[2,4,5],[1,4,5],[4,5,6],[1,2,5],[2,3,5]]
> comb_10_3 `intersect` comb_inf_3 == comb_10_3 
True
> last $ combinations 3 [1..10]
[6,8,10]
> elemIndex [6,8,10] $ combinations 3 [1..]
Just 351

All 10 choose 3 combinations are there!

like image 134
Daniel Avatar answered Oct 06 '22 15:10

Daniel


Just for the sake of completeness, I am putting here the final code which brings together the strict code from Pascal with my lazy stuff and all other Pascal's useful comments.

The lazy list type is defined, then two auxiliary lazy functions (append and map), and finally the function "choose" that we aim to define.

type 'a node_t =
  | Nil                                            
  | Cons of 'a * 'a t
and 'a t = ('a node_t) Lazy.t

let rec append l1 l2 = 
match Lazy.force l1 with
    | Nil -> l2 
    | Cons (a, l) -> lazy (Cons (a, append l l2))

let rec map f ll = lazy (
match Lazy.force ll with
    | Nil ->    Nil
    | Cons(h,t) -> Cons(f h, map f t) )

let rec choose k l len =
  if k = 0 
  then lazy (Cons(lazy Nil,lazy Nil))
  else
        if len < k
        then lazy Nil
        else if k = len
    then lazy (Cons (l,lazy Nil))
    else
      match Lazy.force l with
          | Cons(h,t) ->  let g h sublist = lazy (Cons (h,sublist))
                          in let starting_with_h = (map (g h) (choose (k-1) t (len-1)))
                          in let not_starting_with_h = choose k t (len-1)
                          in append starting_with_h not_starting_with_h
          | Nil -> assert false

The result of evaluating "choose k ls n" is a lazy list of all choices of k elements of list ls, with ls considered up to size n. Note that, as pointed out by Pascal, because of the way the enumeration takes place, the function choose will not cover all choices of an infinite list.

Thanks, this was really useful!

Best, Surikator.

like image 21
Surikator Avatar answered Oct 06 '22 15:10

Surikator


Here is a strict and suboptimal version. I hope it is clear. It avoids duplicates by assuming there are no duplicates in the input list, and by generating only sublists that are in the same order as in the original list.

The length computation could be factored by passing l's length as an argument of choose. That would make the code less readable but more efficient.

For the lazy version, sprinkle "lazy" and "Lazy.force" on the code...

let rec choose k l =
  if k = 0 
  then [ [] ]
  else
    let len = List.length l in
    if len < k
    then []
    else if k = len
    then [ l ]
    else
      match l with
      h :: t ->
          let starting_with_h =
            (List.map (fun sublist -> h :: sublist) (choose (pred k) t))
          in
          let not_starting_with_h = choose k t in
          starting_with_h @ not_starting_with_h
      | [] -> assert false
;;
  val choose : int -> 'a list -> 'a list list = <fun>

# choose 3 [1; 2; 3; 4; 5; 6; 7] ;;                        
- : int list list =
[[1; 2; 3]; [1; 2; 4]; [1; 2; 5]; [1; 2; 6]; [1; 2; 7]; [1; 3; 4]; [1; 3; 5];
 [1; 3; 6]; [1; 3; 7]; [1; 4; 5]; [1; 4; 6]; [1; 4; 7]; [1; 5; 6]; [1; 5; 7];
 [1; 6; 7]; [2; 3; 4]; [2; 3; 5]; [2; 3; 6]; [2; 3; 7]; [2; 4; 5]; [2; 4; 6];
 [2; 4; 7]; [2; 5; 6]; [2; 5; 7]; [2; 6; 7]; [3; 4; 5]; [3; 4; 6]; [3; 4; 7];
 [3; 5; 6]; [3; 5; 7]; [3; 6; 7]; [4; 5; 6]; [4; 5; 7]; [4; 6; 7]; [5; 6; 7]]

EDIT:

A lazy_list_append as appears necessary from the comments below:

type 'a node_t =             
      | Empty
      | Node of 'a * 'a zlist_t
and 'a zlist_t = 'a node_t lazy_t

let rec lazy_list_append l1 l2 =
  lazy 
    (match Lazy.force l1 with
      Empty -> Lazy.force l2 
    | Node (h, lt) ->
    Node (h, lazy_list_append lt l2))
;;
like image 42
Pascal Cuoq Avatar answered Oct 06 '22 15:10

Pascal Cuoq