Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How does Clojure's optimiser work, and where is it?

I am new to Clojure, but not to lisp. A few of the design decisions look strange to me - specifically requiring a vector for function parameters and explicitly requesting tail calls using recur.

Translating lists to vectors (and vice versa) is a standard operation for an optimiser. Tail calls can be converted to iteration by rewriting to equivalent clojure before compiling to byte code. The [] and recur syntax suggest that neither of these optimisations are present in the current implementation.

I would like a pointer to where in the implementation I can find any/all source-to-source transformation passes. I don't speak Java very well so am struggling to navigate the codebase.

If there isn't any optimisation before function-by-function translation to the JVM's byte code, I'd be interested in the design rationale for this. Perhaps to achieve faster compilation?

Thank you.

like image 549
Jon Chesterfield Avatar asked Mar 14 '23 08:03

Jon Chesterfield


2 Answers

There is no explicit optimizer package in the compiler code. Any optimizations are done "inline". Some can be enabled or disabled via compiler flags.


Observe that literal vectors for function parameters are a syntactic choice how functions are represented in source code. Whether they are represented as vectors or list or anything else would not affect runtime and cannot be optimized hence.

Regarding automatic recur, Rich Hickey explained his decision here:

When speaking about general TCO, we are not just talking about recursive self-calls, but also tail calls to other functions. Full TCO in the latter case is not possible on the JVM at present whilst preserving Java calling conventions (i.e without interpreting or inserting a trampoline etc).

While making self tail-calls into jumps would be easy (after all, that's what recur does), doing so implicitly would create the wrong expectations for those coming from, e.g. Scheme, which has full TCO. So, instead we have an explicit recur construct.

Essentially it boils down to the difference between a mere optimization and a semantic promise. Until I can make it a promise, I'd rather not have partial TCO.

Some people even prefer 'recur' to the redundant restatement of the function name. In addition, recur can enforce tail-call position.

like image 56
Leon Grapenthin Avatar answered Mar 20 '23 06:03

Leon Grapenthin


specifically requiring a vector for function parameters

Most other lisps build structures out of syntactic lists. For an associative "map" for example, you build a list of lists. For a "vector", you make a list. For a conditional switch-like expression, you make a list of lists of lists. Lots of lists, lots of parenthesis.

Clojure has made it an obvious goal to make the syntax of lisp more readable and less redundant. A map, set, list, vector all have their own syntax delimiters so they jump out at the eye, while also providing specific functionality that otherwise you'd have to explicitly request using a function if they were all lists. In addition to these structural primitives, other functions like cond minimize the parentheses by removing one layer of parentheses for each pair in the expression, rather than additionally wrapping each pair in yet another grouped parenthesis. This philosophy is widespread throughout the language and its core library so the code is more readable and elegant.

Function parameters as a vector are just part of this syntax. It's not about whether the language can convert a list to a vector easily, it's about how the language requires the placement of function parameters in a function definition -- and it does so by explicitly requiring a vector. And in fact, you can see this clearly in the source for defn:

https://github.com/clojure/clojure/blob/clojure-1.7.0/src/clj/clojure/core.clj#L296

It's just a requirement for how a function is written, that's all.

like image 40
johnbakers Avatar answered Mar 20 '23 06:03

johnbakers