Why does Clojure distinguish between symbols and vars?

I saw this question already, but it doesn't explain what I am wondering about.

When I first came to Clojure from Common Lisp, I was puzzled why it treats symbols and keywords as separate types, but later I figured it out, and now I think it is a wonderful idea. Now I am trying to puzzle out why symbols and vars are separate objects.

As far as I know, Common Lisp implementations generally represent a "symbol" using a structure which has 1) a string for the name, 2) a pointer to the symbol's value when evaluated in function call position, 3) a pointer to its value when evaluated outside call position, and 4) a property list, etc.

Ignoring the Lisp-1/Lisp-2 distinction, the fact remains that in CL, a "symbol" object points directly to its value. In other words, CL combines what Clojure calls a "symbol" and a "var" in a single object.

In Clojure, to evaluate a symbol, first the corresponding var must be looked up, then the var must be dereferenced. Why does Clojure work this way? What benefit could there possibly be from such a design? I understand that vars have certain special properties (they can be private, or const, or dynamic...), but couldn't those properties simply be applied to the symbol itself?

Alex D asked Jul 26 '12 03:07



2 Answers

Other questions have touched on many true aspects of symbols, but I'll try explaining it from another angle.

Symbols are names

Unlike most programming languages, Clojure makes a distinction between things and the names of things. In most languages, if I say something like var x = 1, then it is correct and complete to say "x is 1" or "the value of x is 1". But in Clojure, if I say (def x 1), I've done two things: I've created a Var (a value-holding entity), and I've named it with the symbol x. Saying "the value of x is 1" doesn't quite tell the whole story in Clojure. A more accurate (although cumbersome) statement would be "the value of the var named by the symbol x is 1".

Symbols themselves are just names, while vars are the value-carrying entities. If I extend the earlier example and say (def y x), I have created a second var, named by the symbol y, which holds the same value 1; x still names the original var. One var can nonetheless be reached through more than one symbol: the unqualified x and its namespace-qualified form both name the same var.
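To make the symbol/var/value distinction concrete, here is a minimal sketch. The namespace name example.names and the helper name looked-up are made up for illustration; everything else is plain clojure.core.

```clojure
(ns example.names)

;; (def x 1) creates a var and names it with the symbol x.
(def x 1)

;; The quoted symbol is only a name; it carries no value of its own.
(symbol? 'x)                 ; true
(name 'x)                    ; "x"

;; #'x is shorthand for (var x): the var object itself.
(var? #'x)                   ; true

;; Evaluating x is two steps: find the var the symbol names in this
;; namespace, then dereference that var to get its value.
(def looked-up (ns-resolve 'example.names 'x))
(identical? looked-up #'x)   ; true
(deref looked-up)            ; 1
```

At a REPL the same steps can be tried with (resolve 'x), which looks the symbol up in whatever namespace is current.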

An analogy: my name is "Luke", but that isn't identical with me, with who I am as a person. It's just a word. It's not impossible that at some point I could change my name, and there are many other people that share my name. But in the context of my circle of friends (in my namespace, if you will), the word "Luke" means me. And in a fantasy Clojure-land, I could be a var carrying around a value for you.

But why?

So why this extra concept of names as distinct from variables, rather than conflating the two as most languages do?

For one thing, not all symbols are bound to vars. In local contexts, such as function arguments or let bindings, the value referred to by a symbol in your code isn't actually a var at all - it's just a local binding that will be optimized away and transformed to raw bytecode when it hits the compiler.
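A small sketch of that point, assuming a throwaway namespace example.locals and a hypothetical function shadowed: the same symbol x names a var at the top level and a local inside the let, and the two do not interfere.

```clojure
(ns example.locals)

;; At the top level, x names a var.
(def x 1)

(defn shadowed []
  ;; Inside this let, the symbol x names a local binding instead.
  (let [x 2]
    ;; x evaluates to the local; (var x) still reaches the top-level var.
    [x (deref (var x))]))

(shadowed)   ; [2 1]
```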

Most importantly, though, it's part of Clojure's "code is data" philosophy. The line of code (def x 1) isn't just an expression, it's also data, specifically a list consisting of the values def, x, and 1. This is very important, particularly for macros, which manipulate code as data.

But if (def x 1) is a list, then what are the values in the list? And particularly, what are the types of those values? Obviously 1 is a number. But what about def and x? What is their type, when I'm manipulating them as data? The answer, of course, is symbols.

And that is the main reason symbols are a distinct entity in Clojure. In some contexts, such as macros, you want to take names and manipulate them, divorced from any particular meaning or binding granted by the runtime or the compiler. And the names must be some sort of thing, and the sort of thing they are is symbols.
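As a rough illustration, the form can be quoted and inspected as data, and a macro can build new names out of the symbols it receives. The namespace example.code-as-data, the macro def-doubled, and the name total are all invented for this sketch.

```clojure
(ns example.code-as-data)

;; Quoting stops evaluation, so the form comes back as a plain list.
(def form '(def x 1))

(list? form)         ; true
(map type form)      ; (clojure.lang.Symbol clojure.lang.Symbol java.lang.Long)

;; A hypothetical macro: it receives the symbol it was given, unevaluated,
;; derives a new symbol from its name, and emits a def form using it.
(defmacro def-doubled [sym value]
  (let [new-name (symbol (str (name sym) "-doubled"))]
    `(def ~new-name (* 2 ~value))))

(def-doubled total 21)   ; expands to (def total-doubled (* 2 21))
total-doubled            ; 42
```

Note that total never needs to be bound to anything; the macro only uses it as a name.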

levand answered Sep 21 '22 10:09


After giving this question a lot of thought, I can think of several reasons to differentiate between symbols and vars, or as Omri well put it, to use "two levels of indirection for mapping symbols to their underlying values". I will save the best one for last...

1: By separating the concepts of "a variable" and "an identifier which can refer to a variable", Clojure makes things a bit cleaner conceptually. In CL, when the reader sees a, it returns a symbol object which carries pointers to top-level bindings, even if a is locally bound in the current scope. (In which case the evaluator will not make use of those top-level bindings.) In Clojure, a symbol is just an identifier, nothing more.

This connects to the point some posters made, that symbols can also refer to Java classes in Clojure. If symbols carried bindings with them, those bindings could just be ignored in contexts where the symbol refers to a Java class, but it would be messy conceptually.

2: In some cases, people might want to use symbols as map keys and such. If symbols were mutable objects (as they are in CL), they wouldn't fit well with Clojure's immutable data structures.

3: In (probably rare) cases where symbols are used as map keys, etc., and perhaps even returned by an API, the equality semantics of Clojure's symbols are more intuitive than CL's symbols. (See @amalloy's answer.)

4: Since Clojure emphasizes functional programming, a lot of work is done using higher-order functions like partial, comp, juxt, and so on. Even if you're not using these, you may still take functions as arguments to your own functions, etc.

Now, when you pass my-func to a higher-order function, it does not retain any reference to the variable which is called "my-func". It just captures the value as it is right now. If you redefine my-func later, the change will not "propagate" to other entities which were defined using the value of my-func.

Even in such situations, by using #'my-func, you can explicitly request that the current value of my-func should be looked up every time the derived function is called. (Presumably at the cost of a small performance hit.)

In CL or Scheme, if I needed this kind of indirection, I can imagine storing a function object in a cons or vector or struct, and retrieving it from there every time it was to be called. Actually, any time I needed a "mutable reference" object which could be shared between different parts of the code, I could just use a cons or other mutable structure. But in Clojure, lists/vectors/etc. are all immutable, so you need some way to refer explicitly to "something which is mutable".
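The difference described in point 4 can be sketched as follows. The namespace example.var-indirection and the names by-value and by-var are made up for illustration; my-func stands in for any top-level function.

```clojure
(ns example.var-indirection)

(defn my-func [n] (inc n))

;; partial captures the function object that my-func holds right now.
(def by-value (partial my-func 1))

;; Passing the var instead: a var is callable, and every call
;; dereferences it to find the function it currently holds.
(def by-var (partial #'my-func 1))

;; Redefine my-func; the same var now holds a different function.
(defn my-func [n] (* 100 n))

(by-value)   ; 2, still the old function
(by-var)     ; 100, sees the redefinition
```

The var plays the role that a shared cons or struct slot would play in CL or Scheme: it is the one explicitly mutable cell that both call sites can go through.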

Alex D answered Sep 25 '22 10:09