Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Partition a seq by a "windowing" predicate in Clojure

Tags:

clojure

I would like to "chunk" a seq into subseqs the same as partition-by, except that the function is not applied to each individual element, but to a range of elements.

So, for example:

(gather (fn [a b] (> (- b a) 2)) 
        [1 4 5 8 9 10 15 20 21])

would result in:

[[1] [4 5] [8 9 10] [15] [20 21]]

Likewise:

(defn f [a b] (> (- b a) 2))
(gather f [1 2 3 4]) ;; => [[1 2 3] [4]]
(gather f [1 2 3 4 5 6 7 8 9]) ;; => [[1 2 3] [4 5 6] [7 8 9]]

The idea is that I apply the start of the list and the next element to the function, and if the function returns true we partition the current head of the list up to that point into a new partition.

I've written this:

(defn gather
  [pred? lst]
  (loop [acc [] cur [] l lst]
    (let [a (first cur)
          b (first l)
          nxt (conj cur b)
          rst (rest l)]
      (cond
       (empty? l) (conj acc cur)
       (empty? cur) (recur acc nxt rst)
       ((complement pred?) a b) (recur acc nxt rst)
       :else (recur (conj acc cur) [b] rst)))))

and it works, but I know there's a simpler way. My question is:

Is there a built in function to do this where this function would be unnecessary? If not, is there a more idiomatic (or simpler) solution that I have overlooked? Something combining reduce and take-while?

Thanks.

like image 838
Scott Klarenbach Avatar asked Apr 21 '14 23:04

Scott Klarenbach


3 Answers

Original interpretation of question

We (all) seemed to have misinterpreted your question as wanting to start a new partition whenever the predicate held for consecutive elements.

Yet another, lazy, built on partition-by

(defn partition-between [pred? coll] 
  (let [switch (reductions not= true (map pred? coll (rest coll)))] 
    (map (partial map first) (partition-by second (map list coll switch)))))

(partition-between (fn [a b] (> (- b a) 2)) [1 4 5 8 9 10 15 20 21])
;=> ((1) (4 5) (8 9 10) (15) (20 21))

Actual Question

The actual question asks us to start a new partition whenever pred? holds for the beginning of the current partition and the current element. For this we can just rip off partition-by with a few tweaks to its source.

(defn gather [pred? coll]
  (lazy-seq
   (when-let [s (seq coll)]
     (let [fst (first s)
           run (cons fst (take-while #((complement pred?) fst %) (next s)))]
       (cons run (gather pred? (seq (drop (count run) s))))))))

(gather (fn [a b] (> (- b a) 2)) [1 4 5 8 9 10 15 20 21])
;=> ((1) (4 5) (8 9 10) (15) (20 21))

(gather (fn [a b] (> (- b a) 2)) [1 2 3 4])
;=> ((1 2 3) (4))

(gather (fn [a b] (> (- b a) 2)) [1 2 3 4 5 6 7 8 9])
;=> ((1 2 3) (4 5 6) (7 8 9))
like image 67
A. Webb Avatar answered Nov 20 '22 10:11

A. Webb


Since you need to have the information from previous or next elements than the one you are currently deciding on, a partition of pairs with a reduce could do the trick in this case.

This is what I came up with after some iterations:

(defn gather [pred s]
  (->> (partition 2 1 (repeat nil) s) ; partition the sequence and if necessary
                                      ; fill the last partition with nils
    (reduce (fn [acc [x :as s]]
              (let [n   (dec (count acc))
                    acc (update-in acc [n] conj x)]
                (if (apply pred s)
                  (conj acc [])
                  acc)))
            [[]])))

(gather (fn [a b] (when (and a b) (> (- b a) 2)))
        [1 4 5 8 9 10 15 20 21])

;= [[1] [4 5] [8 9 10] [15] [20 21]]

The basic idea is to make partitions of the number of elements the predicate function takes, filling the last partition with nils if necessary. The function then reduces each partition by determining if the predicate is met, if so then the first element in the partition is added to the current group and a new group is created. Since the last partition could have been filled with nulls, the predicate has to be modified.

Tow possible improvements to this function would be to let the user:

  1. Define the value to fill the last partition, so the reducing function could check if any of the elements in the partition is this value.
  2. Specify the arity of the predicate, thus allowing to determine the grouping taking into account the current and the next n elements.
like image 42
juan.facorro Avatar answered Nov 20 '22 11:11

juan.facorro


I wrote this some time ago in useful:

(defn partition-between [split? coll]
  (lazy-seq
   (when-let [[x & more] (seq coll)]
     (lazy-loop [items [x], coll more]
       (if-let [[x & more] (seq coll)]
         (if (split? [(peek items) x])
           (cons items (lazy-recur [x] more))
           (lazy-recur (conj items x) more))
         [items])))))

It uses lazy-loop, which is just a way to write lazy-seq expressions that look like loop/recur, but I hope it's fairly clear.

I linked to a historical version of the function, because later I realized there's a more general function that you can use to implement partition-between, or partition-by, or indeed lots of other sequential functions. These days the implementation is much shorter, but it's less obvious what's going on if you're not familiar with the more general function I called glue:

(defn partition-between [split? coll]
  (glue conj []
        (fn [v x]
          (not (split? [(peek v) x])))
        (constantly false)
        coll))

Note that both of these solutions are lazy, which at the time I'm writing this is not true of any of the other solutions in this thread.

like image 1
amalloy Avatar answered Nov 20 '22 10:11

amalloy