
looking for a split-on function

Tags:

clojure

I'm looking for a function with the following behavior:

(split-on "" '("" "test" "one" "" "two"))
(() ("test" "one") ("two"))

I can't find it in 'core', and I'm not sure how to look it up. Suggestions?

Edit: split-when looks promising, but I think I am using it wrong.

(t/split-when #(= "" %) '("" "test" "one" "" "two"))
[["" "test" "one" "" "two"] ()]

whereas I am looking for the return value of [[] ["test" "one"] ["two"]]


2 Answers

partition-by is close. You can partition the sequence by whether each member is equal to the split token:

(partition-by #(= "" %) '("" "test" "one" "" "two"))
(("") ("test" "one") ("") ("two"))

This leaves the separators in the result, though they are easy enough to remove:

(remove #(= '("") %) 
       (partition-by empty? ["" "test" "one" "" "two"]))
(("test" "one") ("two"))

If you want to get fancy about it and make a transducer out of that, you can define one like so:

(def split-on
  (comp
   (partition-by #(= "" %))
   (remove #(= '("") %))))

(into [] split-on ["" "test" "one" "" "two"])
[["test" "one"] ["two"]]

This runs in a single pass and avoids building intermediate lazy sequences.
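
One caveat worth illustrating: partition-by groups a run of consecutive separators into a single partition such as ["" ""], which the equality check against '("") does not remove. A minimal sketch of a variant that tests the first element of each partition instead (the name split-on-runs is made up for this example):

```clojure
;; Hypothetical variant: drops every partition made of separators,
;; including runs of two or more, by checking only the first element.
(def split-on-runs
  (comp
   (partition-by #(= "" %))
   (remove #(= "" (first %)))))

(into [] split-on-runs ["a" "" "" "b"])
;; => [["a"] ["b"]]
```

Checking the first element is enough because every partition produced by partition-by holds either only separators or only non-separators.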

To make that into a normal function (if you don't want a transducer):

(defn split-on [coll]
  (into [] (comp
            (partition-by #(= "" %))
            (remove #(= '("") %)))
        coll))
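
Note that the versions above drop the leading empty group, whereas the question asks for [[] ["test" "one"] ["two"]]. If the empty groups matter, a plain reduce may be simpler than partition-by. The following is only a sketch under the assumption that every separator closes the current group and opens a new one (as a string split would); split-on-keeping-empties is a made-up name, not something from clojure.core:

```clojure
;; Hypothetical helper: splits coll at every element equal to sep
;; and keeps empty groups.
(defn split-on-keeping-empties
  [sep coll]
  (reduce (fn [groups x]
            (if (= sep x)
              ;; separator: start a new, empty group
              (conj groups [])
              ;; anything else: append x to the last group
              (conj (pop groups) (conj (peek groups) x))))
          [[]]
          coll))

(split-on-keeping-empties "" ["" "test" "one" "" "two"])
;; => [[] ["test" "one"] ["two"]]
```

Under the same assumption, a trailing separator produces a trailing empty group.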
Arthur Ulfeldt, answered Mar 07 '26 00:03


I was looking for exactly this function recently and had to create it myself. It is available in the Tupelo library. You can see the API docs here: http://cloojure.github.io/doc/tupelo/tupelo.core.html#var-split-when

(split-when pred coll)
  Splits a collection based on a predicate with a collection 
  argument. Finds the first index N such that (pred (drop N coll)) 
  is true. Returns a length-2 vector of [ (take N coll) (drop N coll) ]. 
  If pred is never satisfied, [ coll [] ] is returned.

The unit tests show the function in action (admittedly boring test data):

(deftest t-split-when
  (is= [ [] [0   1   2   3   4]    ] (split-when #(= 0 (first %)) (range 5)))
  (is= [    [0] [1   2   3   4]    ] (split-when #(= 1 (first %)) (range 5)))
  (is= [    [0   1] [2   3   4]    ] (split-when #(= 2 (first %)) (range 5)))
  (is= [    [0   1   2] [3   4]    ] (split-when #(= 3 (first %)) (range 5)))
  (is= [    [0   1   2   3] [4]    ] (split-when #(= 4 (first %)) (range 5)))
  (is= [    [0   1   2   3   4] [] ] (split-when #(= 5 (first %)) (range 5)))
  (is= [    [0   1   2   3   4] [] ] (split-when #(= 9 (first %)) (range 5))))

You can also read the source if you are interested.
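
The docstring also explains the result in the question: the predicate receives the remaining collection, not a single element, so #(= "" %) is never true and nothing gets split. To make that concrete without pulling in the library, here is a rough re-implementation written only from the docstring quoted above. It is not Tupelo's actual source, and split-when-sketch is a made-up name; as one simplification, it never calls pred on an empty remainder.

```clojure
;; Sketch based on the quoted docstring only, not the library's code.
;; Finds the first N where (pred (drop N coll)) is truthy and returns
;; [(take N coll) (drop N coll)] as vectors.
(defn split-when-sketch
  [pred coll]
  (loop [head [] tail (seq coll)]
    (if (or (nil? tail) (pred tail))
      [head (vec tail)]
      (recur (conj head (first tail)) (next tail)))))

;; Predicate compares the whole remaining seq to "", so it never matches:
(split-when-sketch #(= "" %) ["" "test" "one" "" "two"])
;; => [["" "test" "one" "" "two"] []]

;; Predicate inspects the first remaining element instead:
(split-when-sketch #(= "" (first %)) ["test" "one" "" "two"])
;; => [["test" "one"] ["" "two"]]
```

Either way this splits only once, at the first match, so getting every group would still require applying it repeatedly to the second half.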

Alan Thompson, answered Mar 06 '26 23:03


