I'm new to Common Lisp and ran into a performance thing that just struck me as weird. I'm checking if a number is divisible by 10 using rem in a loop. If I move the check into a function, it runs 5x slower. What would cause that?
I'm running SBCL 1.4.5 on 64-bit Ubuntu 18.04.
(defun fn (x)
  (= 0 (rem x 10)))

(defun walk-loop-local (n)
  (loop for i from 1 to n do
    (= 0 (rem i 10))))

(defun walk-loop-func (n)
  (loop for i from 1 to n do
    (fn i)))

(time (walk-loop-local 232792560))
(time (walk-loop-func 232792560))
I'd expect the time to be the same (and a lot faster, but that's a separate question). Instead, here's the output:
CL-USER> (load "loops.lisp")
Evaluation took:
0.931 seconds of real time
0.931389 seconds of total run time (0.931389 user, 0.000000 system)
100.00% CPU
2,414,050,454 processor cycles
0 bytes consed
Evaluation took:
4.949 seconds of real time
4.948967 seconds of total run time (4.948967 user, 0.000000 system)
100.00% CPU
12,826,853,706 processor cycles
0 bytes consed
Common Lisp allows dynamic redefinition of functions: if you redefined fn during the approx. 5 seconds of your second test, the running loop would switch to calling the new definition of fn while running. This feature comes with some constraints on how function calls are compiled and how they can be optimized.
As pointed out by Rainer Joswig in the comments, the above is an oversimplification: there are cases where the compiler may assume functions are not redefined (recursive functions, functions in the same file). See 3.2.2.3 Semantic Constraints, for example:
A call within a file to a named function that is defined in the same file refers to that function, unless that function has been declared notinline. The consequences are unspecified if functions are redefined individually at run time or multiply defined in the same file.
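To make the late-binding behaviour concrete, here is a minimal sketch. The names greet and call-greet are hypothetical, and the notinline declamation is there so the call is guaranteed to go through the global definition even if everything sits in one compiled file.

```lisp
;; greet and call-greet are hypothetical names, for illustration only.
(declaim (notinline greet))       ; calls must look up the current global definition

(defun greet () :old)

(defun call-greet () (greet))

(defparameter *before* (call-greet))           ; :OLD

;; Replace the global definition at run time, without touching call-greet.
(setf (fdefinition 'greet) (lambda () :new))

(defparameter *after* (call-greet))            ; :NEW
```

call-greet was never recompiled, yet it picks up the new definition. That flexibility is what prevents the compiler from blindly merging the body of fn into its callers.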
A function mixes error checking and the computations you want it to perform. At function call boundaries you typically have a prologue where your inputs are checked, and an epilogue where results might be "boxed": if the compiler knows that locally a variable is always a single-float, it can use a raw representation of floats during the extent of the function, but when returning the result, it should be a valid Lisp type, which means coercing it back to a tagged value, for example.
The SBCL compiler tries to ensure the code is safe, where safe means never invoking code that has undefined behaviour in the Lisp specification. Note however that if you call fn with a string input, the code is expected to detect the type error. Unlike in C, a type error at runtime in Lisp is well-defined (as long as the declared type, which defaults to T, encompasses all possible values at runtime). And so, compiling Lisp code for safety tends to add a lot of error checking at multiple points of the program.
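As a rough illustration of that last point, the sketch below calls fn with a string and catches the resulting condition. try-fn is a hypothetical helper, and the behaviour shown is what I would expect from SBCL with its default optimization settings.

```lisp
(defun fn (x)
  (= 0 (rem x 10)))

;; try-fn is a hypothetical helper, for illustration only.
(defun try-fn (x)
  (handler-case (fn x)
    (type-error (e)
      ;; Report the offending value instead of entering the debugger.
      (list :type-error (type-error-datum e)))))

(try-fn 20)      ; normal path, returns T
(try-fn "abc")   ; the runtime check catches the bad argument
```

The check that makes the second call fail cleanly is exactly the kind of work that costs time when fn is called hundreds of millions of times.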
Optimizing code consists of removing checks that are guaranteed to always succeed, which eliminates dead branches in the generated code.
For example, if you consider fn alone, you can see that it has to check its input every time it is called, because it might very well be called with a string input. But when you directly inline the operation, the index i can be statically determined to be an integer, which allows the calls to = and rem to be applied without (much) error checking.
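One way to see the difference for yourself is to give the compiler the same information it gets in the inlined loop and compare the generated code. This is only a sketch: fn-fixnum is a hypothetical variant of fn, and fixnum is my assumption about the range you care about.

```lisp
(defun fn (x)
  (= 0 (rem x 10)))

;; fn-fixnum is a hypothetical variant of fn, for illustration only.
(defun fn-fixnum (x)
  ;; At default safety SBCL treats this declaration as an assertion:
  ;; x is checked once on entry, then = and rem can assume a fixnum.
  (declare (type fixnum x))
  (= 0 (rem x 10)))

;; Compare the machine code at the REPL:
;; (disassemble 'fn)
;; (disassemble 'fn-fixnum)
```

The call itself still goes through the function boundary here, so this does not remove the call overhead. It only shows how much of the body depends on what is known about x.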
Optimization in SBCL happens because there is a static analysis which maps variables to elements of the type lattice of Lisp (the AND and OR type specifiers are basically the greatest lower bound and least upper bound for types, with types T and NIL at both ends). SBCL reports only errors that are sure to happen: you get an error if you call a function that accepts integers from 0 to 5 with an input that is known to always be above 5 or below zero (the two sets have no intersection), but you get no warning if you call it with an integer between 2 and 10. This is safe because the compiler can defer error checking to runtime, contrary to other languages where the runtime has no sense of types (trying to warn every time the code might have errors would result in a lot of warnings, given the open-worldness of Lisp).
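The lattice vocabulary can be poked at directly with typep and subtypep. The forms below are a small sketch using the same 0 to 5 and 2 to 10 ranges as above, and only rely on standard behaviour.

```lisp
;; The two ends of the lattice: NIL is below every type, T is above every type.
(subtypep nil 'integer)                          ; true
(subtypep 'integer t)                            ; true

;; AND behaves like an intersection, OR like a union.
(typep 3 '(and (integer 0 5) (integer 2 10)))    ; true: 3 is in both ranges
(typep 7 '(and (integer 0 5) (integer 2 10)))    ; false: 7 is not in 0..5
(typep 7 '(or  (integer 0 5) (integer 2 10)))    ; true: 7 is in 2..10

;; A range that only partly overlaps 0..5 is not a subtype of it,
;; so a value of that type might or might not be acceptable.
(subtypep '(integer 2 10) '(integer 0 5))        ; false
(subtypep '(integer 2 5)  '(integer 0 5))        ; true
```

The partly overlapping case is the one where the compiler stays silent and leaves the check for runtime.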
You can put (declaim (inline fn)) in your file, before the definition of fn, and then the performance should be identical to the first version. A rule of thumb is that inside a function, things are a bit more static than in the global environment: local functions cannot be redefined, local variables can have their types precisely defined, etc. You have more control over what is always true.
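Applied to the code from the question, the inlining suggestion would look roughly like this. The one thing to watch is ordering: the declamation has to be in effect when fn is defined, so that the compiler keeps the body around for later expansion.

```lisp
(declaim (inline fn))            ; must precede the DEFUN of fn

(defun fn (x)
  (= 0 (rem x 10)))

(defun walk-loop-func (n)
  (loop for i from 1 to n do
    (fn i)))                     ; expanded in place, so i is known to be an integer

;; (time (walk-loop-func 232792560))
```

The trade-off is that callers compiled this way keep their own copy of the body, so redefining fn later does not affect them until they are recompiled.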
Note that the overhead of error checking is only a problem if the checks are executed many times (relative to the rest of the code). If you fill a big array with single-floats and apply numerical code to it, it makes sense to use a specialized array type, like (simple-array single-float), or to declare local variables to be floats with (declare (type single-float x)), so that you don't check that each value is actually a float. In other cases, the overhead is not high enough to be worth spending much time on.
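For the numerical case, a minimal sketch could look like the following. sum-floats and *data* are hypothetical names, and I restrict the array type to one dimension with (*) to keep the example small.

```lisp
;; sum-floats and *data* are hypothetical names, for illustration only.
(defun sum-floats (v)
  ;; One check on entry, then every AREF is known to yield a single-float.
  (declare (type (simple-array single-float (*)) v))
  (let ((acc 0.0f0))
    (declare (type single-float acc))
    (dotimes (i (length v) acc)
      (incf acc (aref v i)))))

(defparameter *data*
  (make-array 4 :element-type 'single-float
                :initial-contents '(1.0f0 2.0f0 3.0f0 4.0f0)))

(sum-floats *data*)   ; sums the four elements
```

With these declarations the type check happens once per call rather than once per element, which is where the savings come from on large arrays.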