What's the easiest way to parse numbers in Clojure?

Tags: clojure

I've been using Java to parse numbers, e.g.

(. Integer parseInt  numberString)

Is there a more clojuriffic way that would handle both integers and floats, and return Clojure numbers? I'm not especially worried about performance here; I just want to process a bunch of whitespace-delimited numbers in a file and do something with them in the most straightforward way possible.

So a file might have lines like:

5  10  0.0002
4  12  0.003

And I'd like to be able to transform the lines into vectors of numbers.

Rob Lachlan asked Apr 14 '10 18:04


3 Answers

You can use the edn reader to parse numbers. This has the benefit of giving you floats or Bignums when needed, too.

user> (require '[clojure.edn :as edn])
nil
user> (edn/read-string "0.002")
0.0020
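To illustrate the point about types, here is a small sketch of which classes the edn reader picks for different inputs. The sample strings and the names samples and parsed-classes are made up for illustration, and the classes noted in the comment are what I'd expect on a current Clojure release.

```clojure
(require '[clojure.edn :as edn])

;; Illustrative inputs only: a small integer, a decimal, and an
;; integer too large to fit in 64 bits.
(def samples ["5" "0.0002" "123123451451245123514236146"])

;; Pair each parsed value with the class the reader chose for it.
(def parsed-classes
  (mapv (fn [s]
          (let [n (edn/read-string s)]
            [n (class n)]))
        samples))

;; The classes come out as java.lang.Long, java.lang.Double and
;; clojure.lang.BigInt, in that order.
```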

If you want one huge vector of numbers, you could cheat and do this:

user> (let [input "5  10  0.002\n4  12  0.003"]
        (edn/read-string (str "[" input "]")))
[5 10 0.0020 4 12 0.0030]

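That's kind of hacky, though. Alternatively, use re-seq to pull out the numeric tokens (note that this pattern ignores signs and exponents):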

user> (let [input "5  10  0.002\n4  12  0.003"]
        (map edn/read-string (re-seq #"[\d.]+" input)))
(5 10 0.0020 4 12 0.0030)

Or one vector per line:

user> (let [input "5  10  0.002\n4  12  0.003"]
        (for [line (line-seq (java.io.BufferedReader.
                              (java.io.StringReader. input)))]
             (mapv edn/read-string (re-seq #"[\d.]+" line))))
([5 10 0.0020] [4 12 0.0030])

I'm sure there are other ways.
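One such alternative, as a rough sketch: split each line on whitespace instead of matching digits with a regex, so that signs and exponents survive. The names parse-line and parse-lines are hypothetical helpers, not part of any library.

```clojure
(require '[clojure.edn :as edn]
         '[clojure.string :as str])

(defn parse-line
  "Splits one line on runs of whitespace and reads each token with the edn reader."
  [line]
  (->> (str/split (str/trim line) #"\s+")
       (remove str/blank?)
       (mapv edn/read-string)))

(defn parse-lines
  "Returns one vector of numbers per non-blank line of text."
  [text]
  (->> (str/split-lines text)
       (remove str/blank?)
       (mapv parse-line)))

;; Usage with the sample data from the question:
(def result (parse-lines "5  10  0.0002\n4  12  0.003"))
```

One caveat: the edn reader will happily read a non-numeric token such as foo as a symbol rather than failing, so add a number? check if the input is not trusted to be all numbers.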

Brian Carper answered Nov 04 '22 23:11


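If you want to be safer, you can use Float/parseFloat, which only ever parses a number, whereas clojure.core/read-string will read (and, depending on *read-eval*, possibly evaluate) arbitrary forms. Use Double/parseDouble instead if you need double precision.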

user=> (map #(Float/parseFloat (% 0)) (re-seq #"\d+(\.\d+)?" "1 2.2 3.5"))
(1.0 2.2 3.5)
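If you are on Clojure 1.11 or later (an assumption about your environment), clojure.core also has parse-long and parse-double, which return nil on bad input instead of throwing. A minimal sketch, where parse-num and parse-tokens are hypothetical helper names:

```clojure
(require '[clojure.string :as str])

(defn parse-num
  "Hypothetical helper: tries a long first, then a double; nil if neither fits.
  Requires Clojure 1.11+ for parse-long and parse-double."
  [s]
  (or (parse-long s) (parse-double s)))

(defn parse-tokens
  "Parses every whitespace-separated token on a line with parse-num."
  [line]
  (mapv parse-num (str/split (str/trim line) #"\s+")))

;; Usage: integers stay longs, everything else becomes a double.
(def tokens (parse-tokens "1 2.2 3.5"))
```

Unlike Float/parseFloat, this keeps integers as integers, and a malformed token shows up as nil in the result rather than as an exception.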

lazy1 answered Nov 04 '22 23:11


Not sure if this is "the easiest way", but I thought it was kind of fun, so... With a reflection hack, you can access just the number-reading part of Clojure's Reader:

(let [m (.getDeclaredMethod clojure.lang.LispReader
                            "matchNumber"
                            (into-array [String]))]
  (.setAccessible m true)
  (defn parse-number [s]
    (.invoke m clojure.lang.LispReader (into-array [s]))))

Then use it like so:

user> (parse-number "123")
123
user> (parse-number "123.5")
123.5
user> (parse-number "123/2")
123/2
user> (class (parse-number "123"))
java.lang.Integer
user> (class (parse-number "123.5"))
java.lang.Double
user> (class (parse-number "123/2"))
clojure.lang.Ratio
user> (class (parse-number "123123451451245"))
java.lang.Long
user> (class (parse-number "123123451451245123514236146"))
java.math.BigInteger
user> (parse-number "0x12312345145124")
5120577133367588
user> (parse-number "12312345142as36146") ; note the "as" in the middle
nil

Notice how this does not throw the usual NumberFormatException if something goes wrong; you could add a check for nil and throw it yourself if you want.
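A minimal sketch of that nil check, written as a wrapper around any parser that returns nil on failure. The names strict, lenient-int and strict-int are hypothetical; lenient-int is only a stand-in for a nil-returning parser such as parse-number, so that the example runs on its own.

```clojure
(defn strict
  "Wraps a parser that returns nil on failure so that it throws instead."
  [parse]
  (fn [s]
    (let [n (parse s)]
      (if (nil? n)
        (throw (NumberFormatException. (str "Invalid number: " (pr-str s))))
        n))))

;; Stand-in for a nil-returning parser such as parse-number above;
;; hypothetical, and limited to plain decimal integers.
(defn lenient-int [s]
  (when (re-matches #"-?\d+" s)
    (Long/parseLong s)))

(def strict-int (strict lenient-int))

;; (strict-int "123") returns 123, while (strict-int "12as3") throws
;; NumberFormatException.
```

With the reflection hack in place, (strict parse-number) would give the same throwing behaviour for the full reader syntax.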

As for performance, let's have an unscientific microbenchmark (both functions have been "warmed up"; initial runs were slower as usual):

user> (time (dotimes [_ 10000] (parse-number "1234123512435")))
"Elapsed time: 564.58196 msecs"
nil
user> (time (dotimes [_ 10000] (read-string "1234123512435")))
"Elapsed time: 561.425967 msecs"
nil

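The obvious disclaimer: clojure.lang.LispReader.matchNumber is a private static method of clojure.lang.LispReader and may be changed or removed at any time. The classes shown above are also from an early Clojure release; since Clojure 1.3 the reader returns java.lang.Long for integers that fit in 64 bits and clojure.lang.BigInt for larger ones.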

Michał Marczyk answered Nov 04 '22 23:11