Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Idiomatic way to represent sum type (Either a b) in Clojure

Edited. My question now is: what idiomatic Clojure constructs are usually used instead of sum types in statically types languages? Consensus so far: use protocols if behaviour can be unified, use tagged pairs/maps otherwise, put necessary asserts in pre- and post-conditions.

Clojure provides many ways to express product types: vectors, maps, records..., but how do you represent sum types, also known as tagged unions and variant records? Something like Either a b in Haskell or Either[+A, +B] in Scala.

The first thing which comes to my mind is a map with a special tag: {:tag :left :value a}, but then all the code is going to be polluted with conditionals on (:tag value) and handling special cases if it's not there... What I'd like to ensure, is that :tag is always there, and it can take only one of the specified values, and corresponding value is consistently of the same type/behaviour and cannot be nil, and there is an easy way to see that I took care of all cases in the code.

I can think of a macro in the lines of defrecord, but for the sum types:

; it creates a special record type and some helper functions
(defvariant Either
   left Foo
   right :bar)
; user.Either

(def x (left (Foo. "foo")))   ;; factory functions for every variant
; #user.Either{:variant :left :value #user.Foo{:name "foo"}}
(def y (right (Foo. "bar")))  ;; factory functions check types
; SomeException...
(def y (right ^{:type :bar} ()))
; #user.Either{:variant :right :value ()}

(variants x) ;; list of all possible options is intrinsic to the value
; [:left :right]

Does a thing like this already exist? (Answered: no).

like image 579
sastanin Avatar asked Jun 08 '12 11:06

sastanin


4 Answers

how do you represent sum types, also known as tagged unions and variant records? Something like Either a b in Haskell or Either[+A, +B] in Scala.

Either has two uses: to return a value of one of two types or to return two values of the same type that should have different semantics based on the tag.

The first use is only important when using a static type system. Either is basically the minimum solution possible given the constraints of the Haskell type system. With a dynamic type system, you can return values of any type you want. Either is not needed.

The second use is significant but can be accomplished quite simply in two (or more) ways:

  1. {:tag :left :value 123} {:tag :right :value "hello"}
  2. {:left 123} {:right "hello"}

What I'd like to ensure, is that :tag is always there, and it can take only one of the specified values, and corresponding value is consistently of the same type/behaviour and cannot be nil, and there is an easy way to see that I took care of all cases in the code.

If you would like to ensure this statically, Clojure is probably not your language. The reason is simple: expressions do not have types until runtime--until they return a value.

The reason a macro will not work is that at macro expansion time, you do not have runtime values--and hence runtime types. You have compile-time constructs like symbols, atoms, sexpressions, etc. You can eval them, but using eval is considered bad practice for a number of reasons.

However, we can do a pretty good job at runtime.

  • What I'd like to ensure, is that :tag is always there,
  • and it can take only one of the specified values
  • and corresponding value is consistently of the same type/behaviour
  • and cannot be nil
  • and there is an easy way to see that I took care of all cases in the code.

My strategy will be to convert everything that is normally static (in Haskell) to runtime. Let's write some code.

;; let us define a union "type" (static type to runtime value)
(def either-string-number {:left java.lang.String :right java.lang.Number})

;; a constructor for a given type
(defn mk-value-of-union [union-type tag value]  
  (assert (union-type tag)) ; tag is valid  
  (assert (instance? (union-type tag) value)) ; value is of correct type  
  (assert value)  
  {:tag tag :value value :union-type union-type}) 

;; "conditional" to ensure that all the cases are handled  
;; take a value and a map of tags to functions of one argument
;; if calls the function mapped to the appropriate tag
(defn union-case-fn [union-value tag-fn]
  ;; assert that we handle all cases
  (assert (= (set (keys tag-fn))
             (set (keys (:union-type union-value)))))
  ((tag-fn (:tag union-value)) (:value union-value)))

;; extra points for wrapping this in a macro

;; example
(def j (mk-value-of-union either-string-number :right 2))

(union-case-fn j {:left #(println "left: " %) :right #(println "right: " %)})
=> right: 2

(union-case-fn j {:left #(println "left: " %)})
=> AssertionError Assert failed: (= (set (keys tag-fn)) (set (keys (:union-type union-value))))

This code uses the following idiomatic Clojure constructs:

  • Data-driven programming: create a data structure which represents the "type". This value is immutable and first-class and you have the entire language available to implement logic with it. This is something that I don't believe Haskell can do: manipulate types at runtime.
  • Using maps to represent values.
  • Higher-order programming: passing a map of fns to another function.

You could optionally use protocols if you are using Either for polymorphism. Otherwise, if you are interested in the tag, something of the form {:tag :left :value 123} is the most idiomatic. You will often see something like this:

;; let's say we have a function that may generate an error or succeed
(defn somefunction []
  ...
  (if (some error condition)
    {:status :error :message "Really bad error occurred."}
    {:status :success :result [1 2 3]}))

;; then you can check the status
(let [r (somefunction)]
  (case (:status r)
    :error
    (println "Error: " (:message r))
    :success
    (do-something-else (:result r))
    ;; default
    (println "Don't know what to do!")))
like image 68
Eric Normand Avatar answered Nov 12 '22 21:11

Eric Normand


In general, sum types in dynamically typed languages are represented as:

  • tagged pairs (e.g. a product type with a tag representing the constructor)
  • case analysis on the tag at runtime to do a dispatch

In a statically typed language, most values are distinguished by types -- meaning you do not need to do runtime tag analysis to know whether you have an Either or a Maybe -- so you just look at the tag to know if it is a Left or a Right.

In a dynamically typed setting, you must do runtime type analysis first (to see what type of value you have), and then the case analysis of the constructor (to see which flavor of value you have).

One way is to assign a unique tag for every constructor of every type.

In a way, you can think of dynamic typing as putting all values into a single sum type, deferring all type analysis to runtime tests.


What I'd like to ensure, is that :tag is always there, and it can take only one of the specified values, and corresponding value is consistently of the same type/behaviour and cannot be nil, and there is an easy way to see that I took care of all cases in the code.

As an aside, this is pretty much a description of what a static type system would do.

like image 42
Don Stewart Avatar answered Nov 12 '22 20:11

Don Stewart


Without the completion of something mind-blowing like typed clojure I don't think you can avoid runtime checking of assertions.

One lesser known feature provided by clojure which can definitely help with the runtime checks is the implementation of pre- and post-conditions (see http://clojure.org/special_forms and a blog post by fogus). I think you could even use a single higher order wrapper function with pre- and post-conditions to check all your assertions on the relevant code. That avoids the runtime check "pollution problem" quite nicely.

like image 22
Alex Stoddard Avatar answered Nov 12 '22 21:11

Alex Stoddard


the reason this works so well in some languages is that you dispatch (usually by type) on the result - ie you use some property (usually type) of the result to decide what to do next.

so you need to look at how dispatch can happen in clojure.

  1. nil special case - the nil value is special-cased in various places and can be used as the "None" part of "Maybe". for example, if-let is very useful.

  2. pattern matching - base clojure doesn't have much support for this, apart from destructuring sequences, but there are various libraries that do. see Clojure replacement for ADTs and Pattern Matching? [update: in the comments mnicky says that is outdated and you should use core.match]

  3. by type with OO - methods are selected by type. so you can return different subclasses of a parent and call a method that is overloaded to do the different operations you want. if you're coming from a functional background that will feel very odd/clumsy, but it's an option.

  4. tags by hand - finally, you can use case or cond with explicit tags. more usefully, you can wrap them in some kind of macro that works the way you want.

like image 4
andrew cooke Avatar answered Nov 12 '22 19:11

andrew cooke