Stuart Halloway gives the example
(re-seq #"\w+" "The quick brown fox")
as the natural way to find regex matches in Clojure. In his book, this construction is contrasted with iterating over a matcher. If all one cared about were a list of matches, this would be great. But what if I want the matches and their positions within the string? Is there a better way to do this that lets me leverage the existing functionality in java.util.regex without resorting to something like a sequence comprehension over each index in the original string? In other words, one would like to type something like
(re-seq-map #"[0-9]+" "3a1b2c1d")
which would return a map with keys as the position and values as the matches, e.g.
{0 "3", 2 "1", 4 "2", 6 "1"}
Is there an implementation of this in an existing library already, or shall I write it myself (it shouldn't be too many lines of code)?
You can fetch the data you want out of a java.util.regex.Matcher object.
user> (defn re-pos [re s]
        (loop [m (re-matcher re s)
               res {}]
          (if (.find m)
            (recur m (assoc res (.start m) (.group m)))
            res)))
#'user/re-pos
user> (re-pos #"\w+" "The quick brown fox")
{16 "fox", 10 "brown", 4 "quick", 0 "The"}
user> (re-pos #"[0-9]+" "3a1b2c1d")
{6 "1", 4 "2", 2 "1", 0 "3"}
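The maps printed above are not ordered by position, since a plain Clojure map makes no ordering guarantee. If ascending positions matter, one option is to accumulate into a sorted map instead. This is a minimal sketch of that variation on re-pos; the name re-pos-sorted is hypothetical and not part of any library.

```clojure
;; Hypothetical variant of re-pos: same loop, but the accumulator is a
;; sorted map, so keys are kept in ascending order of match position.
(defn re-pos-sorted [re s]
  (loop [m (re-matcher re s)
         res (sorted-map)]
    (if (.find m)
      ;; .start is the index of the current match, .group its text
      (recur m (assoc res (.start m) (.group m)))
      res)))

(re-pos-sorted #"\w+" "The quick brown fox")
;; => {0 "The", 4 "quick", 10 "brown", 16 "fox"}
```

The trade-off is a slightly more expensive insert per match, which is unlikely to matter for typical strings.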
You can apply any function to the java.util.regex.Matcher object and return its results (similar to Brian's solution, but without an explicit loop):
user=> (defn re-fun
         [re s fun]
         (let [matcher (re-matcher re s)]
           (take-while some? (repeatedly #(if (.find matcher) (fun matcher) nil)))))
#'user/re-fun
user=> (defn fun1 [m] (vector (.start m) (.end m)))
#'user/fun1
user=> (re-fun #"[0-9]+" "3a1b2c1d" fun1)
([0 1] [2 3] [4 5] [6 7])
user=> (defn re-seq-map
         [re s]
         (into {} (re-fun re s #(vector (.start %) (.group %)))))
user=> (re-seq-map #"[0-9]+" "3a1b2c1d")
{0 "3", 2 "1", 4 "2", 6 "1"}
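Because re-fun accepts any function of the matcher, the same approach can return richer per-match data. The sketch below repeats re-fun so that it runs on its own and adds a hypothetical helper, match-info, that is not part of the answer above. The result is wrapped in vec so the lazy sequence, which is backed by a single mutable Matcher, is fully realized at once.

```clojure
;; re-fun as defined above, repeated so this snippet is self-contained
(defn re-fun
  [re s fun]
  (let [matcher (re-matcher re s)]
    (take-while some? (repeatedly #(if (.find matcher) (fun matcher) nil)))))

;; Hypothetical helper: describe the current match as a map.
;; The type hint only avoids reflection; it does not change behavior.
(defn match-info [^java.util.regex.Matcher m]
  {:start (.start m) :end (.end m) :match (.group m)})

;; vec forces the lazy sequence eagerly, in match order
(vec (re-fun #"[0-9]+" "3a1b2c1d" match-info))
;; => [{:start 0, :end 1, :match "3"} {:start 2, :end 3, :match "1"}
;;     {:start 4, :end 5, :match "2"} {:start 6, :end 7, :match "1"}]
```

Note that, as with re-fun itself, the function passed in must not return nil for a valid match, since take-while stops at the first nil.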