Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Common Lisp best practices to use type declarations for optimization

I have a Common Lisp function that merges two ordered lists of symbols, without duplicates (two ordered sets):

(defun my-merge (x y)
  "merge two lists of symbols *already sorted and without duplicates*
   (and return the resulting list sorted and without duplicates)"
  (let* ((first (cons nil nil))
         (last first))
    (loop while (and x y)
       for cx = (car x)
       for cy = (car y)
       if (string= cx cy)
       do (setf x (cdr x))
       else if (string< cx cy)
       do (rplacd last (cons cx nil))
       and do (setf last (cdr last) 
                    x (cdr x))
       else do (rplacd last (cons cy nil))
       and do (setf last (cdr last) 
                    y (cdr y)))
    (rplacd last (or x y))
    (cdr first)))

Since I have found only scarce information about the use of type declarations in practical cases in order to compile efficiently the code, I am unsure if it is sufficient to declare the variables, for instance in this way:

(defun my-merge (x y)
  "merge two list of symbols *already sorted and without duplicates*"
  (declare (list x y))
  (let* ((first (cons nil nil))
         (last first))
    (declare (cons first last))
    (loop while (and x y)
       for cx symbol = (car x)
       for cy symbol = (car y)
       ...

or, as I suppose, if it is necessary also to add the the specifier to my code? But then, where and in which cases should I add it?

There is some rule that one can follow?

Should I also declare the type of my functions, again for the optimization purposes?

like image 905
Renzo Avatar asked Feb 08 '23 01:02

Renzo


1 Answers

Style

Since you don't actually use the extended LOOP features in any useful way and the LOOP syntax isn't that great for your example, I would propose to write it with the primitive LOOP. See how COND makes it more readable for a Lisp programmer:

(defun my-merge (x y &aux (first (list nil)) (last first) cx cy)
  (macrolet ((cdr! (v)
               `(setf ,v (cdr ,v))))
    (loop (unless (and x y)
            (return))
          (setf cx (car x) cy (car y))
          (cond ((string= cx cy)
                 (cdr! x))
                ((string< cx cy)
                 (rplacd last (list cx))
                 (cdr! last)
                 (cdr! x))
                (t
                 (rplacd last (list cy))
                 (cdr! last)
                 (cdr! y))))
    (rplacd last (or x y))
    (cdr first)))

Compiling

Given the level of sophistication of a compiler:

  • fully stupid = compiler ignores all declarations -> declarations don't help

  • mostly stupid = compiler needs declarations everywhere, but optimizes -> you need to write a lot of declarations

example:

(let ((a 1) (b 2))
  (declare (integer a b))
  (let ((c (the integer (* (the integer (+ a b))
                           (the integer (- a b))))))
     (declare (integer c))
     (the integer (* c c))))

Note that it might not enough to know what the argument types are, it might be necessary to declare the type of results. Thus the use of the. DISASSEMBLE and the profiler are your friends.

  • basic = compiler needs type declarations, optimizes, but also can infer some types. Types for the standard language is known.

Even better compilers complain about type errors, can propagate types across functions and can complain when certain optimizations are not possible.

Sequence functions

Note that sequence functions are a particular tough case. Sequences have as subtypes lists and vectors (including strings).

Let's say a sequence function is:

(foo result-type sequence-1 sequence-2 fn)
  • if the sequences are of the same type, one might want to have an optimized code versions for lists and another one for vectors.

  • if the sequences are of different types, it might be useful to convert one sequences to a different type. Maybe not.

  • the result type also has influence, depending on result types, different algorithms may be possible/necessary

So the degree of freedom is quite high. The compiler might contribute to fast code. But also the implementation of the particular sequence function might be able to do some optimization at runtime.

Then fn is a function which takes elements and produces new elements. It might be helpful to know its type signature - or not.

I can't really say which current Common Lisp has a sophisticated implementation of the sequence functions. Though I remember that the Symbolics Common Lisp implementations put some effort into it.

Documentation and papers

Often what the compiler can optimize and how is not well documented, if at all. There are some papers about this topic, but often they are old and/or outdated.

  • The Python compiler of CMUCL: The Compiler.
  • The Python compiler of CMUCL: Advanced Compiler Use.
  • The Python compiler for CMU Common Lisp (Postscript)
  • SBCL Compiler
  • Allegro CL: Compiling
  • LispWorks: Optimizing your code
  • Performance beyond expectations
  • How to make Lisp code go faster than C
  • An evaluation of major Lisp compilers
like image 85
Rainer Joswig Avatar answered May 08 '23 11:05

Rainer Joswig