Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Loop'ing over arrays or lists indifferently

Problem

Let's say you have a number of lists or arrays, let's say two for the sake of example :

(defparameter *arr* #(1 2 3))
(defparameter *list* '(4 5 6))

You can loop over them using either across or in keywords :

(loop for elem across *arr* do (format t "~a" elem))
     => 123
(loop for elem in *list* do (format t "~a" elem))
     => 456

I want to be able to loop over these arrays or lists using the same syntax. I am using SBCL and execution speed is a concern.

Using being the elements of

This syntax is nice, as it works regardless of its argument being a list or array.

(loop for elem being the elements of *arr* do (format t "~a" elem))
     => 123
(loop for elem being the elements of *list* do (format t "~a" elem))
     => 456

But its speed is horrendous. If we do a quick comparison by accessing lists or arrays of 100 elements 1M times :

(format t "# Test 1.1.1 : Accessing list of doubles with loop 'in': ") (terpri)
(let ((test-list (make-list 100 :initial-element 12.2d0))
      (testvar 0d0))
  (declare (optimize (speed 3))
           (type list test-list)
           (type double-float testvar))
  (time (dotimes (it 1000000 t) (loop for el in test-list do
                                      (setf testvar (the double-float el))))))

(format t "# Test 1.1.2 : Accessing list of doubles with loop 'elements': ") (terpri)
(let ((test-list (make-list 100 :initial-element 12.2d0))
      (testvar 0d0))
  (declare (optimize (speed 3))
           (type list test-list)
           (type double-float testvar))
  (time (dotimes (it 1000000 t) (loop for el being the elements of test-list do
                                      (setf testvar (the double-float el))))))

(format t "# Test 1.2.1 : Accessing simple-array of doubles using loop 'across' : ") (terpri)
(let ((test-array (make-array 100 :initial-element 12.2d0 :element-type 'double-float))
      (testvar 0d0))
  (declare (optimize (speed 3))
           (type double-float testvar)
           (type simple-array test-array))
  (time (dotimes (it 1000000 t) (loop for el across test-array do
                                      (setf testvar (the double-float el))))))

(format t "# Test 1.2.2 : Accessing simple-array of doubles using loop 'elements' : ") (terpri)
(let ((test-array (make-array 100 :initial-element 12.2d0 :element-type 'double-float))
      (testvar 0d0))
  (declare (optimize (speed 3))
           (type double-float testvar)
           (type simple-array test-array))
  (time (dotimes (it 1000000 t) (loop for el being the elements of test-array do
                                      (setf testvar (the double-float el))))))

It gives us :

# Test 1.1.1 : Accessing list of doubles with loop 'in': 
Evaluation took:
  0.124 seconds of real time
  0.123487 seconds of total run time (0.123471 user, 0.000016 system)
  99.19% CPU
  445,008,960 processor cycles
  672 bytes consed
  
# Test 1.1.2 : Accessing list of doubles with loop 'elements': 
Evaluation took:
  0.843 seconds of real time
  0.841639 seconds of total run time (0.841639 user, 0.000000 system)
  99.88% CPU
  3,034,104,192 processor cycles
  0 bytes consed
  
# Test 1.2.1 : Accessing simple-array of doubles using loop 'across' : 
Evaluation took:
  0.062 seconds of real time
  0.062384 seconds of total run time (0.062384 user, 0.000000 system)
  100.00% CPU
  224,896,032 processor cycles
  0 bytes consed
  
# Test 1.2.2 : Accessing simple-array of doubles using loop 'elements' : 
Evaluation took:
  1.555 seconds of real time
  1.554472 seconds of total run time (1.541572 user, 0.012900 system)
  [ Run times consist of 0.094 seconds GC time, and 1.461 seconds non-GC time. ]
  99.94% CPU
  5,598,161,100 processor cycles
  1,600,032,848 bytes consed

I think it must use the elt accessor ? Anyway the penalty in speed is unacceptable.

Trying to be smart with macros

I wrote something to be able to achieve my goal of having the same syntax for list and array. I think it's not great because it seems overly awkward, but here :

(defun forbuild (el-sym list-or-array list-or-array-sym)
  "Outputs either :
     * (for el-sym in list-or-array)
     * (for el-sym across list-or-array)
Depending on type of list-or-array.
el-sym : symbol, eg. 'it1
list-or-array : declared, actual data for list or array
list-or-array-sym : symbol name for the table, to avoid writing the data in full
                    in the 'loop' call using eval.
Example call : (forbuild 'it1 arr 'arr)"
  (cond ((typep list-or-array 'array)
         `(for ,el-sym across ,list-or-array-sym))
        ((typep list-or-array 'list)
         `(for ,el-sym in ,list-or-array-sym))))

(defun forbuild-l (l-elsyms l-lars l-larsyms)
  "forbuild but over lists of things."
  (let ((for-list nil)
        (list-temp nil))
    (loop for elem in l-elsyms
          for lar in l-lars
          for larsym in l-larsyms do
          (setf list-temp (forbuild elem lar larsym))
          (loop for word-temp in list-temp do
                (push word-temp for-list)))
    (nreverse for-list)))

(defun loop-expr (forlist body)
  "Creates the expression ready to be evaluated to execute the loop.
forlist : List of symbols to be inserted syntactically. eg.
          FOR IT1 ACROSS ARR1 FOR IT2 IN ARR2
body : all the expression after the 'for' clauses in the 'loop'."
  `(loop ,@forlist ,@body))

(defmacro looparl (element list-or-array &rest body)
  (let ((forlist (gensym)))
    `(let ((,forlist (forbuild2-l (quote ,element)
                                  (list ,@list-or-array)
                                  (quote ,list-or-array))))
       (loop-expr ,forlist (quote ,body)))))

Basically I build the right loop syntax from the arguments. The version of looparl given here can be called this way :

(let ((arr1 #(7 8 9))
      (list2 (list 10 11 12)))
    (looparl (it1 it2) (arr1 list2) do (format t "~a ~a" it1 it2) (terpri)))
=> (LOOP FOR IT1 ACROSS ARR1
  FOR IT2 IN LIST2
  DO (FORMAT T "~a ~a" IT1 IT2) (TERPRI))

The actual evaluation of this outputted expression is omitted in this example, because it doesn't work on non-global names. If we throw in an eval at the end of looparl :

(defmacro looparl (element list-or-array &rest body)
  (let ((forlist (gensym)))
    `(let ((,forlist (forbuild2-l (quote ,element)
                                  (list ,@list-or-array)
                                  (quote ,list-or-array))))
       (eval (loop-expr ,forlist (quote ,body))))))

And work on global variables, we see that we still have a speed issue, since there are evaluations happening at runtime :

(looparl (it1 it2) (*arr* *list*) for it from 100
              do (format t "~a ~a ~a" it1 it2 it) (terpri))
=> 1 4 100
   2 5 101
   3 6 102
(time (dotimes (iter 1000 t) (looparl (it1 it2) (*arr* *list*) for it from 100
              do (format t "~a ~a ~a" it1 it2 it) (terpri))))
=> Evaluation took:
  1.971 seconds of real time
  1.932610 seconds of total run time (1.892329 user, 0.040281 system)
  [ Run times consist of 0.097 seconds GC time, and 1.836 seconds non-GC time. ]
  98.07% CPU
  1,000 forms interpreted
  16,000 lambdas converted
  7,096,353,696 processor cycles
  796,545,680 bytes consed

The macros are evaluated each one at a time a thousand times. But surely the type is known at compile time no ? The type of syntax in looparl is very nice, and I'd like to be able to use it without speed penalty.

I read this note in Peter Seibel's book Practical Common Lisp, chapter "Loop for Black Belts"

3 You may wonder why LOOP can't figure out whether it's looping over a list or a vector without needing different prepositions. This is another consequence of LOOP being a macro: the value of the list or vector won't be known until runtime, but LOOP, as a macro, has to generate code at compile time. And LOOP's designers wanted it to generate extremely efficient code. To be able to generate efficient code for looping across, say, a vector, it needs to know at compile time that the value will be a vector at runtime--thus, the different prepositions are needed.

Am I committing some big Common-Lisp nonsense ? How would you go about creating a working, quick looparl ?

Edit 1 : FOR library

Thank you Ehvince for the reference to the FOR library. The over keyword in the for:for function is indeed exactly what I'd need. However the benchmarks are really underwhelming :

(let ((test-list (make-list 100 :initial-element 12.2d0))
      (testvar 0d0))
  (declare (optimize (speed 3))
           (type list test-list)
           (type double-float testvar))
  (time (dotimes (it 1000000 t) 
          (for:for ((el over test-list))
            (setf testvar (the double-float el))))))

(let ((test-array (make-array 100 :initial-element 12.2d0))
      (testvar 0d0))
  (declare (optimize (speed 3))
           (type simple-array test-array)
           (type double-float testvar))
  (time (dotimes (it 1000000 t) 
          (for:for ((el over test-array))
            (setf testvar (the double-float el))))))

Evaluation took:
  4.802 seconds of real time
  4.794485 seconds of total run time (4.792492 user, 0.001993 system)
  [ Run times consist of 0.010 seconds GC time, and 4.785 seconds non-GC time. ]
  99.83% CPU
  17,286,934,536 processor cycles
  112,017,328 bytes consed
  
Evaluation took:
  6.758 seconds of real time
  6.747879 seconds of total run time (6.747879 user, 0.000000 system)
  [ Run times consist of 0.004 seconds GC time, and 6.744 seconds non-GC time. ]
  99.85% CPU
  24,329,311,848 processor cycles
  63,995,808 bytes consed

The speed of this library using the specialized keywords in and across is the same as for the standard loop. But very slow with over.

Edit 2 : map and etypecase

Thank you sds and Rainer Joswig for the suggestions. It would indeed work for the simple case in which I would only have one array/list to iterate over. Let me tell you about one use case I had in mind : I was implementing a gnuplot wrapper, both as training and to have my own program in my toolkit. I wanted to take from the user lists or arrays indifferently to serve as series to pipe to gnuplot. This is why I need to be able to loop over multiple array/lists simultaneously + using the elegant loop clauses for iteration number etc.

In this use case (gnuplot wrapper), I only have two or three for clauses in my loop for each "data block", so I have thought of writing each combination depending on the type of input by hand and it is possible, but very awkward. And I'd be stuck if I had to do something like :

(loop for el1 in list1
      for el2 across arr1
      for el3 in list2
      for el4 in list3
      ...)

With the list-i and arr-i being inputs. Another fallback plan for this use case is just to convert everything to arrays.

I thought that since it is quite easily conceptualized, I could write something flexible and fast once and for all, but there must be a reason why it is neither in the specs nor in SBCL-specific code.

like image 213
Thomas Houllier Avatar asked Dec 23 '22 04:12

Thomas Houllier


1 Answers

What you are looking for is called map: both

(map nil #'princ '(1 2 3))

and

(map nil #'princ #(1 2 3))

print 123.

However, lists and arrays are very different beasts, and it is best to decide in advance which one you want to use.

like image 157
sds Avatar answered Jan 31 '23 08:01

sds