
Why do Clojure variable arity args get different types depending on use?

In answering another question I came across something I didn't expect with Clojure's variable arity function args:

user=> (defn wtf [& more] (println (type more)) :ok)
#'user/wtf

;; 1)
user=> (wtf 1 2 3 4)
clojure.lang.ArraySeq
:ok

;; 2)
user=> (let [x (wtf 1 2 3 4)] x)
clojure.lang.ArraySeq
:ok

;; 3)
user=> (def x (wtf 1 2 3 4))
clojure.lang.PersistentVector$ChunkedSeq
#'user/x
user=> x
:ok

Why is the type ArraySeq in 1) and 2), but PersistentVector$ChunkedSeq in 3)?

asked Sep 25 '14 15:09 by overthink



2 Answers

Short answer: It's an obscure implementation detail of Clojure. The only thing guaranteed by the language is that the rest-param of a variadic function will be passed as an instance of clojure.lang.ISeq, or nil if there are no additional arguments. You should code accordingly.
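
As a minimal sketch of coding to that guarantee (summarize-rest is a made-up name for illustration, not something from the question), a variadic function can check for nil and otherwise stick to ordinary seq functions:

```clojure
;; Hypothetical helper: relies only on the documented contract that the
;; rest param is an ISeq, or nil when no extra arguments are passed.
(defn summarize-rest [& more]
  (if (nil? more)
    :no-extra-args
    ;; first and count work on any ISeq, whatever concrete class
    ;; the runtime happened to hand over.
    {:first (first more) :count (count more)}))

(summarize-rest)                   ; no extra args, so more is nil
(summarize-rest 1 2 3 4)           ; direct call
(apply summarize-rest [1 2 3 4])   ; call through apply
```

The direct call and the apply call return the same map, even though the concrete type of more may differ between them.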

Long answer: It has to do with whether the function call is compiled or simply evaluated. Without going into a full dissertation on the difference between evaluation and compilation, it should be sufficient to know that Clojure code gets parsed into an AST. Depending on the context, expressions in the AST could get evaluated directly (something akin to interpretation), or could get compiled into Java bytecode as part of a dynamically-generated class. The typical case where the latter happens is in the body of a lambda expression, which will evaluate to an instance of a dynamically generated class that implements the IFn interface. See the Clojure documentation for a more detailed explanation of evaluation.

The vast majority of the time, the difference between compiled and evaluated code will be invisible to your program; they will behave in exactly the same way. This is one of those rare corner cases where compilation and evaluation result in subtly different behavior. It's important to point out, though, that both behaviors are correct in that they conform to the promises made by the language.

Function calls in Clojure code get parsed into an instance of InvokeExpr in clojure.lang.Compiler. If the code is being compiled, then the compiler emits bytecode that will call the invoke method on an IFn object using an appropriate arity (Compiler.java, line 3650). If the code is just being evaluated and not compiled, then the function arguments are bundled up in a PersistentVector and passed to the applyTo method on the IFn object (Compiler.java, line 3553).

Clojure functions that have a variadic arg list are compiled into subclasses of the clojure.lang.RestFn class. This class implements all the methods of IFn, gathers arguments, and dispatches to the appropriate doInvoke arity. You can see in the implementation of applyTo that, in the case of 0 required args (as is the case in your wtf function), the input seq is passed through to the doInvoke method and visible to the function implementation. The 4-arg version of invoke, meanwhile, bundles up the arguments in an ArraySeq and passes this to the doInvoke method, so now your code sees an ArraySeq.
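
The two entry points described above can be exercised by hand through Java interop. This is only a sketch: rest-args is a hypothetical helper, and calling invoke and applyTo directly is for illustration, not something normal code needs to do.

```clojure
;; Hypothetical helper: returns its rest args unchanged so they can be inspected.
(defn rest-args [& more] more)

;; What a compiled call site does: invoke with a fixed arity.
;; RestFn gathers the four arguments into a freshly created seq.
(type (.invoke ^clojure.lang.IFn rest-args 1 2 3 4))

;; What an evaluated call (or apply) does: applyTo with an existing seq.
;; With zero required args, that same seq object reaches the function body.
(let [s (seq [1 2 3 4])]
  (identical? s (.applyTo ^clojure.lang.IFn rest-args s)))
```

The first expression should report clojure.lang.ArraySeq, matching cases 1) and 2) in the question, while the second shows the input seq being passed through untouched.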

To complicate matters, the implementation of Clojure's eval function (which is what the REPL is calling) will internally wrap a list form being evaluated inside a thunk (an anonymous, no-arg function), then compile and execute the thunk. So almost all invocations use compiled calls to the invoke method, rather than being interpreted directly by the compiler. There's a special case for def forms that explicitly evaluates the code without compiling, which accounts for the different behavior you're seeing there.

The implementation of clojure.core/apply also calls the applyTo method, so by this logic the seq of whatever collection is passed to apply should be what the function body sees. Indeed:

user=> (apply wtf [1 2 3 4])
clojure.lang.PersistentVector$ChunkedSeq
:ok

user=> (apply wtf (list 1 2 3 4))
clojure.lang.PersistentList
:ok

answered Oct 07 '22 20:10 by Alex


Clojure is, for the most part, implemented not in terms of concrete classes but in terms of interfaces and protocols (a Clojure abstraction over Java interfaces with a few extra features).

user> (require '[clojure.reflect :as reflect])
nil
user> (:bases (reflect/reflect clojure.lang.ArraySeq))
#{clojure.lang.IndexedSeq clojure.lang.IReduce clojure.lang.ASeq}
user> (:bases (reflect/reflect clojure.lang.PersistentVector$ChunkedSeq))
#{clojure.lang.Counted clojure.lang.IChunkedSeq clojure.lang.ASeq}

Good Clojure code doesn't work in terms of ArraySeq vs. PersistentVector$ChunkedSeq. Instead, it calls the methods or protocol functions exposed by IndexedSeq, IReduce, ASeq, etc., if its argument implements them. More likely still, it uses the basic clojure.core functions that are implemented in terms of these interfaces or protocols.
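
A rough sketch of what that looks like in practice (seq-facts is a hypothetical function, named only for this example): it asks about abstractions through clojure.core predicates and never mentions a concrete class.

```clojure
;; Hypothetical function: inspects its rest args only through
;; clojure.core functions that work on any seq implementation.
(defn seq-facts [& more]
  {:seq?     (seq? more)        ; is it an ISeq?
   :counted? (counted? more)    ; does it implement Counted?
   :sum      (reduce + 0 more)})

(seq-facts 1 2 3 4)                ; rest args built by a direct call
(apply seq-facts [1 2 3 4])        ; rest args backed by a vector's seq
(apply seq-facts (list 1 2 3 4))   ; rest args backed by a list
```

All three calls produce the same map, because every seq type involved implements the interfaces these functions rely on.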

For example, note the definition of reduce:

user> (source reduce)
(defn reduce
  "f should be a function of 2 arguments. If val is not supplied,
  returns the result of applying f to the first 2 items in coll, then
  applying f to that result and the 3rd item, etc. If coll contains no
  items, f must accept no arguments as well, and reduce returns the
  result of calling f with no arguments.  If coll has only 1 item, it
  is returned and f is not called.  If val is supplied, returns the
  result of applying f to val and the first item in coll, then
  applying f to that result and the 2nd item, etc. If coll contains no
  items, returns val and f is not called."
  {:added "1.0"}
  ([f coll]
     (clojure.core.protocols/coll-reduce coll f))
  ([f val coll]
     (clojure.core.protocols/coll-reduce coll f val)))
nil

If you look up coll-reduce in protocols.clj, you find various implementations based on the interfaces or protocols implemented:

(extend-protocol CollReduce
  nil
  (coll-reduce
   ([coll f] (f))
   ([coll f val] val))

  Object
  (coll-reduce
   ([coll f] (seq-reduce coll f))
   ([coll f val] (seq-reduce coll f val)))

  clojure.lang.IReduce
  (coll-reduce
   ([coll f] (.reduce coll f))
   ([coll f val] (.reduce coll f val)))

  ;;aseqs are iterable, masking internal-reducers
  clojure.lang.ASeq
  (coll-reduce
   ([coll f] (seq-reduce coll f))
   ([coll f val] (seq-reduce coll f val)))
  ...) ; etcetera
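
The same dispatch style is available to your own code. As a small sketch (the Describe protocol and the describe-args function are invented for this example and are not part of Clojure), extending a protocol to clojure.lang.ISeq covers every seq type a rest param might turn out to be:

```clojure
;; Describe, describe and describe-args are hypothetical names.
(defprotocol Describe
  (describe [x]))

(extend-protocol Describe
  ;; No rest args at all: the rest param is nil.
  nil
  (describe [_] "nothing")

  ;; Fallback for anything without a more specific implementation.
  Object
  (describe [_] "some other object")

  ;; Any seq implementation: ArraySeq, a vector's ChunkedSeq, a list, ...
  clojure.lang.ISeq
  (describe [s] (str "a seq of " (count s) " items")))

(defn describe-args [& more] (describe more))

(describe-args)                    ; dispatches on nil
(describe-args 1 2 3 4)            ; dispatches on ISeq
(apply describe-args [1 2 3 4])    ; also dispatches on ISeq
```

Because dispatch is on the interface, the direct call and the apply call give the same answer regardless of which concrete seq class arrives.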

answered Oct 07 '22 22:10 by noisesmith