Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why doesn't Haskell have symbols (a la ruby) / atoms (a la erlang)?

The two languages where I have used symbols are Ruby and Erlang and I've always found them to be extremely useful.

Haskell does have algebraic datatypes, but I still think symbols would be mighty convenient. An immediate use that springs to mind is that since symbols are isomorphic to integers you can use them where you would use an integral or a string "primary key".

The syntactic sugar for atoms can be minor - :something or <something> is an atom. All atoms are instances of a Type called Atom which derives Show and Eq. You can then use it for more descriptive error codes, for example

type ErrorCode = Atom
type Message = String
data Error = Error ErrorCode Message
loginError = Error :redirect "Please login first"

In this case :redirect is more efficient than using a string ("redirect") and easier to understand than an integer (404).

The benefit may seem minor, but I say it is worth adding atoms as a language feature (or at least a GHC extension).

So why have symbols not been added to the language? Or am I thinking about this the wrong way?

like image 267
Anupam Jain Avatar asked May 28 '11 20:05

Anupam Jain


2 Answers

I agree with camccann's answer that it's probably missing mainly because it would have to be baked quite deeply into the implementation and it is of too little use for this level of complication. In Erlang (and Prolog and Lisp) symbols (or atoms) usually serve as special markers and serve mostly the same notion as a constructor. In Lisp, the dynamic environment includes the compiler, so it's partly also a (useful) compiler concept leaking into the runtime.

The problem is the following, symbol interning is impure (it modifies the symbol table). Because we never modify an existing object it is referentially transparent, however, but if implemented naïvely can lead to space leaks in the runtime. In fact, as currently implemented in Erlang you can actually crash the VM by interning too many symbols/atoms (current limit is 2^20, I think), because they can never get garbage collected. It's also difficult to implement in a concurrent setting without a huge lock around the symbol table.

Both problems can be (and have been) solved, however. For example, see Erlang EEP 20. I use this technique in the simple-atom package. It uses unsafePerformIO under the hood, but only in (hopefully) rare cases. It could still use some help from the GC to perform an optimisation similar to indirection shortening. It also uses quite a few IORefs internally which isn't too great for performance and memory usage.

In summary, it can be done but implementing it properly is non-trivial. Compiler writers always weigh the power of a feature against its implementation and maintenance efforts, and it seems like first-class symbols lose out on this one.

like image 163
nominolo Avatar answered Oct 10 '22 20:10

nominolo


I think the simplest answer is that, of the things Lisp-style symbols (which is where both Ruby and Erlang got the idea, I believe) are used for, in Haskell most are either:

  • Already done in some other fashion--e.g. a data type with a bunch of nullary constructors, which also behave as "convenient names for integers".

  • Awkward to fit in--things that exist at the level of language syntax instead of being regular data usually have more type information associated with them, but symbols would have to either be distinct types from each other (nearly useless without some sort of lightweight ad-hoc sum type) or all the same type (in which case they're barely different from just using strings).

Also, keep in mind that Haskell itself is actually a very, very small language. Very little is "baked in", and of the things that are most are just syntactic sugar for other primitives. This is a bit less true if you include a bunch of GHC extensions, but GHC with -XAndTheKitchenSinkToo is not the same language as Haskell proper.

Also, Haskell is very amenable to pseudo-syntax and metaprogramming, so there's a lot you can do even without having it built in. Particularly if you get into TH and scary type metaprogramming and whatever else.

So what it mostly comes down to is that most of the practical utility of symbols is already available from other features, and the stuff that isn't available would be more difficult to add than it's worth.

like image 36
C. A. McCann Avatar answered Oct 10 '22 19:10

C. A. McCann