Here's the brief problem:
Input: a list of strings, each containing numbers
(" 3.4 5.4 1.2 6.4" "7.8 5.6 4.3" "1.2 3.2 5.4")
Output: a list of numbers
(3.4 5.4 1.2 6.4 7.8 5.6 4.3 1.2 3.2 5.4)
Here's my attempt at coding this:
(defun parse-string-to-float (line &optional (start 0))
"Parses a list of floats out of a given string"
(if (equalp "" line)
nil
(let ((num (multiple-value-list (read-from-string (subseq line start)))))
(if (null (first num))
nil
(cons (first num) (parse-string-to-float (subseq line (+ start (second num)))))))))
(defvar *data* (list " 3.4 5.4 1.2 6.4" "7.8 5.6 4.3" "1.2 3.2 5.4"))
(setf *data* (format nil "~{~a ~}" *data*))
(print (parse-string-to-float *data*))
===> (3.4 5.4 1.2 6.4 7.8 5.6 4.3 1.2 3.2 5.4)
However, for rather large data sets, it's a slow process. I'm guessing the recursion isn't as tight as possible and I'm doing something unnecessary. Any ideas?
Furthermore, the grand project involves taking an input file that has various data sections separated by keywords. Example -
%FLAG START_COORDS
1 2 5 8 10 12
%FLAG END_COORDS
3 7 3 23 9 26
%FLAG NAMES
ct re ct cg kl ct
etc... I'm trying to parse a hash-table with the keywords that follow %FLAG as the keys, and the values stored as number or string lists depending on the particular keyword I'm parsing. Any ideas for libraries that already do this very type of job, or simple ways around this in lisp?
This is not a task you want to be doing recursively to begin with. Instead, use LOOP
and a COLLECT
clause. For example:
(defun parse-string-to-floats (line)
(loop
:with n := (length line)
:for pos := 0 :then chars
:while (< pos n)
:for (float chars) := (multiple-value-list
(read-from-string line nil nil :start pos))
:collect float))
Also, you might want to consider using WITH-INPUT-FROM-STRING
instead of READ-FROM-STRING
, which makes things even simpler.
(defun parse-string-to-float (line)
(with-input-from-string (s line)
(loop
:for num := (read s nil nil)
:while num
:collect num)))
As for performance, you might want to do some profiling, and ensure that you are actually compiling your function.
EDIT to add: One thing you do need to be careful of is that the reader can introduce a security hole if you're not sure of the source of the string. There's a read macro, #.
, which can allow evaluation of arbitrary code following it when it's read from a string. The best way to protect yourself is by binding the *READ-EVAL*
variable to NIL
, which will make the reader signal an error if it encounters #.
. Alternatively, you can use one of the specialized libraries that Rainer Joswig mentions in his answer.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With