Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Directly generating specific subsets of a powerset?

Haskell's expressiveness enables us to rather easily define a powerset function:

import Control.Monad (filterM)

powerset :: [a] -> [[a]]
powerset = filterM (const [True, False])

To be able to perform my task it is crucial for said powerset to be sorted by a specific function, so my implementation kind of looks like this:

import Data.List (sortBy)
import Data.Ord (comparing)

powersetBy :: Ord b => ([a] -> b) -> [a] -> [[a]]
powersetBy f = sortBy (comparing f) . powerset

Now my question is whether there is a way to only generate a subset of the powerset given a specific start and endpoint, where f(start) < f(end) and |start| < |end|. For example, my parameter is a list of integers ([1,2,3,4,5]) and they are sorted by their sum. Now I want to extract only the subsets in a given range, lets say 3 to 7. One way to achieve this would be to filter the powerset to only include my range but this seems (and is) ineffective when dealing with larger subsets:

badFunction :: Ord b => b -> b -> ([a] -> b) -> [a] -> [[a]]
badFunction start end f = filter (\x -> f x >= start && f x <= end) . powersetBy f

badFunction 3 7 sum [1,2,3,4,5] produces [[1,2],[3],[1,3],[4],[1,4],[2,3],[5],[1,2,3],[1,5],[2,4],[1,2,4],[2,5],[3,4]].

Now my question is whether there is a way to generate this list directly, without having to generate all 2^n subsets first, since it will improve performance drastically by not having to check all elements but rather generating them "on the fly".

like image 631
ThreeFx Avatar asked Apr 03 '15 19:04

ThreeFx


People also ask

What is the power set of all subsets of s?

Power Set Power set P (S) of a set S is the set of all subsets of S. For example S = {a, b, c} then P (s) = { {}, {a}, {b}, {c}, {a,b}, {a, c}, {b, c}, {a, b, c}}. Recommended: Please solve it on “ PRACTICE ” first, before moving on to the solution. Attention reader! Don’t stop learning now.

Where P (A) denotes the powerset?

Where P (A) denotes the powerset. Let us understand the concept with the help of examples and properties. In set theory, the power set (or powerset) of a Set A is defined as the set of all subsets of the Set A including the Set itself and the null or empty set. It is denoted by P (A).

What is a power set?

The definition seems a little confusing but in real, Power sets are very easily understood. Imagine a set with some elements present in them, now write all the possible subsets that can be written for the particular set, treat the subsets as elements as place them in a separate set, this set obtained will be called as a Power set.

What is the powerset of an empty set?

An empty set has zero elements. Therefore, powerset of an empty set { }, can be mentioned as; A set containing a null set. It contains zero or null elements. The empty set is the only subset.


2 Answers

If you want to allow for completely general ordering-functions, then there can't be a way around checking all elements of the powerset. (After all, how would you know the isn't a special clause built in that gives, say, the particular set [6,8,34,42] a completely different ranking from its neighbours?)

However, you could make the algorithm already drastically faster by

  1. Only sorting after filtering: sorting is O (n · log n), so you want keep n low here; for the O (n) filtering step it matters less. (And anyway, number of elements doesn't change through sorting.)
  2. Apply the ordering-function only once to each subset.

So

import Control.Arrow ((&&&))

lessBadFunction :: Ord b => (b,b) -> ([a]->b) -> [a] -> [[a]]
lessBadFunction (start,end) f
     = map snd . sortBy (comparing fst)
               . filter (\(k,_) -> k>=start && k<=end)
               . map (f &&& id)
               . powerset

Basically, let's face it, powersets of anything but a very small basis are infeasible. The particular application “sum in a certain range” is pretty much a packaging problem; there are quite efficient ways to do that kind of thing, but you'll have to give up the idea of perfect generality and of quantification over general subsets.

like image 103
leftaroundabout Avatar answered Sep 27 '22 19:09

leftaroundabout


Since your problem is essentially a constraint satisfaction problem, using an external SMT solver might be the better alternative here; assuming you can afford the extra IO in the type and the need for such a solver to be installed. The SBV library allows construction of such problems. Here's one encoding:

import Data.SBV

-- c is the cost type
-- e is the element type
pick :: (Num e, SymWord e, SymWord c) => c -> c -> ([SBV e] -> SBV c) -> [e] -> IO [[e]]
pick begin end cost xs = do
        solutions <- allSat constraints
        return $ map extract $ extractModels solutions
  where extract ts  = [x | (t, x) <- zip ts xs, t]
        constraints = do tags <- mapM (const free_) xs
                         let tagged    = zip tags xs
                             finalCost = cost [ite t (literal x) 0 | (t, x) <- tagged]    
                         solve [finalCost .>= literal begin, finalCost .<= literal end]

test :: IO [[Integer]]
test = pick 3 7 sum [1,2,3,4,5]

We get:

Main> test
[[1,2],[1,3],[1,2,3],[1,4],[1,2,4],[1,5],[2,5],[2,3],[2,4],[3,4],[3],[4],[5]]

For large lists, this technique will beat out generating all subsets and filtering; assuming the cost function generates reasonable constraints. (Addition will be typically OK, if you've multiplications, the backend solver will have a harder time.)

(As a side note, you should never use filterM (const [True, False]) to generate power-sets to start with! While that expression is cute and fun, it is extremely inefficient!)

like image 41
alias Avatar answered Sep 27 '22 20:09

alias