Let's say I have sets A_1,...A_n,
e.g. [[a b c][d e][f]]
. I would like to find Cartesian product of these sets but not including any terms which are supersets of elements of some ignore list.
For instance if my ignore list is [[a e][c]]
, the result of the Cartesian product would be [[a d f][b d f][b e f]]
. Note any term with c
is not in there, neither is [a e f]
.
Of course one way I could do this is to find the full cartesian product and then remove the offending items, but I would like a more efficient way, such that I avoid checking solutions in the first place.
I have an initial solution which involves incrementally building each term in the cart-product, and at each stage I remove any elements from A_i
if adding them to the term I am building would cause it to be a superset of any one of the ignores.
This works fine, and is better than the naive solution, but there is still a large amount of redundant checking, which also depeneds on the order in which the sets are presented. E.g. if [f]
was in my ignore list, I would still keep trying to create terms until I reach [f]
and then discard.
For concreteness, my clojure implementation is
(defn first-elements
"Get the first elements of a set of sets, unless ignored"
[sets ignores in-ignore?]
(loop [product-tuple [] sets sets]
(println "sets " sets)
(cond
(or (nil? sets) (nil? (first sets)))
product-tuple
:else
(if-let [set-op (remove #(in-ignore? product-tuple ignores %) (first sets))]
(if (and (coll? set-op) (empty? set-op))
product-tuple
(recur (conj product-tuple (first set-op)) (next sets)))
product-tuple))))
(defn in-ignore?
"if I add elem to this build will it become a superset of any of the ignores"
[build ignores elem]
(some #(clojure.set/superset? (conj (set build) elem) %) ignores))
(defn cartesian-product-ignore
"All the ways to take one item from each sequence, except for ignore"
[ignores original-sets]
(loop [cart-prod #{} sets original-sets]
(let [firsts (first-elements sets ignores in-ignore?)]
(print "firsts " firsts "-cart-prod " cart-prod " sets " sets "\n")
(cond
(zero? (count firsts))
cart-prod
(= (count sets) (count firsts))
(recur (conj cart-prod firsts) (update-in sets [(dec (count sets))] next))
:else
(recur cart-prod (assoc
(update-in sets [(dec (count firsts))] next)
(count firsts)
(original-sets (count firsts))))))))
I think there are some improvements that can be made over your current approach. But first, let's implement a basic cartisian-product
. Then we can adapt it to accept an ignores list. This is easy enough using for
and some recursion:
(defn cartesian-product [colls]
(if (empty? colls)
(list ())
(for [e (first colls)
sub-product (cartesian-product (rest colls))]
(cons e sub-product))))
;; Quick test run
(cartesian-product [[:a :b :c] [:d :e] [:f]])
=> ((:a :d :f) (:a :e :f) (:b :d :f) (:b :e :f) (:c :d :f) (:c :e :f))
Good. And since we're using for
, we have the advantage of laziness. If you need your result to be something other than a sequence of sequences, it's easy enough to convert it to something else.
Now, the hard part -- implementing the ignore sets. According to your description, your current approach is to remove elements from A_i if adding them to the term you are building would cause that term to become a superset of any of the ignore sets. As your code illustrates, not only is this somewhat inefficient (for example, superset?
is worst-case linear time w.r.t. the size of its first parameter), but it also makes the code more complicated than it needs to be.
So let's adopt a different approach. Instead of removing elements from A_i, let's remove any elements we add to a term from the ignore sets. Then we can prune a term if any of the ignore sets are empty. As a bonus, all it requires is a few changes to our previous cartesian-product
implementation:
(defn cartesian-product-ignore [ignore-sets colls]
(cond (some empty? ignore-sets) () ; prune
(empty? colls) (list ()) ; base case
:else ; recursive case
(for [e (first colls)
sub-product (cartesian-product-ignore (map (fn [s]
(disj s e))
ignore-sets)
(rest colls))]
(cons e sub-product))))
;; test without any ignore sets
(cartesian-product-ignore [] [[:a :b :c] [:d :e] [:f]])
=> ((:a :d :f) (:a :e :f) (:b :d :f) (:b :e :f) (:c :d :f) (:c :e :f))
;; Now the moment of truth
(cartesian-product-ignore [(set [:a :e]) (set [:c])] [[:a :b :c] [:d :e] [:f]])
=> ((:a :d :f) (:b :d :f) (:b :e :f))
Of course, minor changes may be required to fit your exact needs. For example, you might want to accept ignore sets as a vector or sequence and convert them to sets internally. But that is the essence of the algorithm..
Here a core.logic (naive) approach
(ns testing
(:refer-clojure :exclude [==])
(:use [clojure.core.logic])
)
(run* [q]
(fresh [x y z]
(membero x [:a :b :c])
(membero y [:d :e])
(membero z [:f])
(== q [x y z])
(!= q [:a :e z] )
(!= q [:c y z] )
)
)
==> ([:a :d :f] [:b :d :f] [:b :e :f])
Although it's much more slow than @Nathan_Davis algorithm, 23.263 msecs vs 0.109 msecs
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With