Partitioning in Clojure with a lazy collection of strings

Question

Starting with a collection of strings like:

(def str-coll ["abcd" "efgh" "jklm"])

The goal is to pull a fixed number of characters at a time from the head of the string collection, producing a partitioned grouping of characters. This is the desired behavior:

;; clojure.string replaced the old clojure.contrib.str-utils2 namespace
(use '[clojure.string :only (join)])
(partition-all 3 (join "" str-coll))

((\a \b \c) (\d \e \f) (\g \h \j) (\k \l \m))

However, using join forces evaluation of the entire collection, which causes memory issues when dealing with very large collections of strings. My specific use case is generating subsets of strings from a lazy collection generated by parsing a large file of delimited records:

(use '[clojure.java.io :only (reader)])

(defn file-coll [in-file]
  (->> (line-seq (reader in-file))
    (partition-by #(.startsWith ^String % ">"))
    (partition 2)))

This builds on work from a previous question. I've tried combinations of reduce, partition, and join, but can't come up with the right incantation to pull characters from the head of the first string and lazily evaluate subsequent strings as needed. Thanks much for any ideas or pointers.
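For readers following along, here is a small sketch of the structure file-coll produces. It assumes that reader is clojure.java.io/reader, and the sample text is made up for illustration. It is wrapped in a StringReader because passing a plain string to reader would be treated as a file name or URL.

```clojure
(require '[clojure.java.io :refer [reader]])

(defn file-coll [in-file]
  (->> (line-seq (reader in-file))
       ;; group consecutive lines by whether they are ">" header lines
       (partition-by #(.startsWith ^String % ">"))
       ;; pair each header group with the group of lines that follows it
       (partition 2)))

;; Hypothetical sample input: two records, each a ">" header plus data lines.
(def sample ">id1\nabcd\nefgh\n>id2\njklm\n")

(file-coll (java.io.StringReader. sample))
;; => (((">id1") ("abcd" "efgh")) ((">id2") ("jklm")))
```

Each record's second element is a collection of strings like str-coll above, which is the input the partitioning step needs to consume lazily.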

Alex Taggart · Accepted Answer

Not quite sure what you're going for, but the following does what your first example does, and does so lazily.

Step-by-step for clarity:

user=> (def str-coll ["abcd" "efgh" "jklm"])
#'user/str-coll
user=> (map seq str-coll)
((\a \b \c \d) (\e \f \g \h) (\j \k \l \m))
user=> (flatten *1)
(\a \b \c \d \e \f \g \h \j \k \l \m)
user=> (partition 3 *1)
((\a \b \c) (\d \e \f) (\g \h \j) (\k \l \m))

All together now:

(->> str-coll 
  (map seq)
  flatten
  (partition 3))
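As a variation on the pipeline above (a sketch, not the answerer's code), mapcat can replace the map/flatten pair, and partition-all keeps a trailing short group that partition would drop. The names chunk-chars, realized, and counted-strings are hypothetical helpers introduced here; the counter is only a rough way to see how many input strings each approach touches.

```clojure
(require '[clojure.string :as string])

(defn chunk-chars
  "Hypothetical helper: lazily regroups the characters of the strings in
  strs into strings of length n (the last one may be shorter)."
  [n strs]
  (->> strs
       (mapcat seq)            ; lazy stream of characters across all strings
       (partition-all n)       ; groups of n characters, remainder kept
       (map #(apply str %))))  ; turn each group back into a string

(chunk-chars 3 ["abcd" "efgh" "jklm"])
;; => ("abc" "def" "ghj" "klm")

;; Count how many input strings are realized by each approach.
(def realized (atom 0))

(defn counted-strings
  "Lazy seq of n copies of \"abcd\" that bumps the counter as each is realized."
  [n]
  (map (fn [s] (swap! realized inc) s) (repeat n "abcd")))

;; join builds one big string, so every input string is realized.
(reset! realized 0)
(first (partition-all 3 (string/join "" (counted-strings 1000))))
@realized
;; => 1000

;; The lazy pipeline only realizes the first few input strings.
(reset! realized 0)
(first (chunk-chars 3 (counted-strings 1000)))
@realized
;; a small number, well under 1000
```

Because nothing here holds the whole input, the same function also works on an unbounded input such as (take 2 (chunk-chars 3 (repeat "abcd"))).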

Tags: clojure, lazy-evaluation

Brad Chapman

1 Answers

Alex Taggart
