Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Fast powerset implementation with complement set

I would like to have a function

powersetWithComplements :: [a] -> [([a], [a])]

Such that for example:

powersetWithComplements [1,2,3] = [([],[1,2,3]),([3],[1,2]),([2],[1,3]),([2,3],[1]),([1],[2,3]),([1,3],[2]),([1,2],[3]),([1,2,3],[])]

It is easy to obtain some implementation, for example

powerset :: [a] -> [[a]]
powerset = filterM (const [False, True])

powersetWithComplements s = let p = powerset s in zip p (reverse p)

Or

powersetWithComplements s = [ (x, s \\ x) | x <- powerset s]

But I estimate that the performance of both these would be really poor. What would be an optimal approach? It is possible to use different data structure than the [] list.

like image 661
user1747134 Avatar asked Dec 11 '17 14:12

user1747134


2 Answers

Well you should see a powerset like this: you enumerate over the items of the set, and you decide whether you put these in the "selection" (first item of the tuple), or not (second item of the tuple). By enumerating over these selections exhaustively, we get the powerset.

So we can do the same, for instance using recursion:

import Control.Arrow(first, second)

powersetWithComplements [] = [([],[])]
powersetWithComplements (x:xs) = map (second (x:)) rec ++ map (first (x:)) rec
    where rec = powersetWithComplements xs

So here the map (second (x:) prepends all the second items of the tuples of the rec with x, and the map (second (x:) does the same for the first item of the tuples of rec. where rec is the recursion on the tail of the items.

Prelude Control.Arrow> powersetWithComplements [1,2,3]
[([],[1,2,3]),([3],[1,2]),([2],[1,3]),([2,3],[1]),([1],[2,3]),([1,3],[2]),([1,2],[3]),([1,2,3],[])]

The advantage of this approach is that we do not generate a complement list for every list we generate: we concurrently build the selection, and complement. Furthermore we can reuse the lists we construct in the recursion, which will reduce the memory footprint.

In both time complexity and memory complexity, the powersetWithComplements function will be equal (note that this is complexity, of course in terms of processing time it will require more time, since we do an extra amount of work) like the powerset function, since prepending a list is usually done in O(1)), and we now build two lists (and a tuple) for every original list.

like image 65
Willem Van Onsem Avatar answered Nov 19 '22 04:11

Willem Van Onsem


Since you are looking for a "fast" implementation, I thought I would share some benchmark experiments I did with Willem's solution.

I thought using a DList instead of a plain list would be a big improvement, since DLists have constant-time append, whereas appending lists is linear in the size of the left argument.

psetDL :: [a] -> [([a],[a])]
psetDL = toList . go
    where
    go [] = DList.singleton ([],[])
    go (x:xs) = (second (x:) <$> rec) <> (first (x:) <$> rec)
        where
        rec = go xs

But that did not have a significant effect.

I suspected this is because we are traversing both sublists anyway because of the fmap (<$>). We can avoid the traversal by doing something similar to CPS-converting the function, passing down the accumulated sets as parameters rather than returning them.

psetTail :: [a] -> [([a],[a])]
psetTail = go [] []
    where
    go a b [] = [(a,b)]
    go a b (x:xs) = go a (x:b) xs <> go (x:a) b xs

This yielded a 220% improvement on a list of size 20. Now since we aren't traversing the lists from fmapping, we can get rid of the append traversal by using a DList:

psetTailDL :: [a] -> [([a],[a])]
psetTailDL = toList . go [] []
    where
    go a b [] = DList.singleton (a,b)
    go a b (x:xs) = go a (x:b) xs <> go (x:a) b xs

Which yields an additional 20% improvement.

like image 41
luqui Avatar answered Nov 19 '22 04:11

luqui