
Haskell terminology: meaning of type vs. data type, are they synonyms?

I'm reading the book The Haskell School of Expression, and on page 56, at the beginning of chapter 5, I came across the terms "polymorphic data types" and "polymorphic types".

Do these two terms refer to the same concept?

Are they synonyms?

Or is there any difference between the two? If yes, what?

jhegedus asked Jul 18 '14 13:07


2 Answers

A type (in Haskell) is a piece of syntax which can meaningfully be put right of :: to classify an expression left of ::. Each syntactic component of a type is itself classified by a kind, where the kind of types (which classify expressions) is *. Some people are happy to use the word "type" to refer to any component of the type syntax, whether or not its kind allows it to classify expressions.

The syntax of types can be extended by various declaration forms.

  1. A type synonym, e.g., type Foo x y z = [x] -> IO (y, z), adds type components of fully applied form Foo x y z, which expand macro-fashion in accordance with their defining equation.
  2. A data declaration, e.g., data Goo x y z = ThisGoo x | ThatGoo (Goo y z x) introduces a fresh type constructor symbol Goo to the syntax of types, which is used to build the types which classify values generated by the data constructors, here ThisGoo and ThatGoo.
  3. A newtype declaration, e.g., newtype Noo x y z = MkNoo (x, [y], z) makes a copy of an existing type which is distinguished from the original in the syntax of types.
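The three declaration forms above can be tried out side by side. The sketch below reuses Foo, Goo and Noo from the list; the helper functions countAndTag, gooDepth and unNoo are hypothetical names added only for illustration.

```haskell
-- Type synonym: Foo x y z expands, macro-fashion, to [x] -> IO (y, z).
type Foo x y z = [x] -> IO (y, z)

-- Data declaration: a fresh type constructor Goo with two data constructors.
data Goo x y z = ThisGoo x | ThatGoo (Goo y z x)

-- Newtype declaration: a distinct copy of the tuple type (x, [y], z).
newtype Noo x y z = MkNoo (x, [y], z)

-- Because Foo is only a synonym, this signature means [Char] -> IO (Int, Bool).
countAndTag :: Foo Char Int Bool
countAndTag xs = pure (length xs, null xs)

-- Goo values are inspected by matching on the data constructors.
-- The explicit signature is needed: the recursive call is at type Goo y z x.
gooDepth :: Goo x y z -> Int
gooDepth (ThisGoo _) = 0
gooDepth (ThatGoo g) = 1 + gooDepth g

-- A Noo is not interchangeable with the tuple; it must be unwrapped explicitly.
unNoo :: Noo x y z -> (x, [y], z)
unNoo (MkNoo t) = t
```

Note that countAndTag needs no wrapping or unwrapping, whereas values of Goo and Noo can only be built and taken apart through their constructors.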

A type is polymorphic if it contains type variables which can be substituted with other type components: the values classified by polymorphic types can be specialized to any substitution instance of the type variables. E.g. append (++) :: [a] -> [a] -> [a] works on lists whose elements have the same type, but any type will do. Values with polymorphic types are often referred to as "polymorphic values".
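As a small sketch of what "specialized to any substitution instance" means, the same (++) can be given two different monomorphic signatures. The names joinInts, joinChars and twice are hypothetical.

```haskell
-- (++) :: [a] -> [a] -> [a], specialised with a := Int.
joinInts :: [Int] -> [Int] -> [Int]
joinInts = (++)

-- The same polymorphic value, specialised with a := Char.
joinChars :: String -> String -> String
joinChars = (++)

-- A user-defined polymorphic value: it works for lists of any element type.
twice :: [a] -> [a]
twice xs = xs ++ xs
```

Both specialisations type check because each signature is a substitution instance of the polymorphic type of (++).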

Sometimes, "data type" is used to mean, quite simply, a type introduced by a data declaration. In this sense, all data types are types, but not all types are data types. Examples of types which are not data types include IO () and Int -> Int. Also, Int is not a data type in this sense: it is a hardwired primitive type. For the avoidance of doubt, some people call types introduced by data declarations algebraic data types, because the constructors give an algebra, meaning "a bunch of operations for building values by combining other values". A "polymorphic data type" is a data type with type variables in it, such as [(a, Bool)], by contrast with [Int]. Sometimes people talk about "declaring a polymorphic data type" or say things like "Maybe is a polymorphic data type" when they really just mean that the type constructor has parameters (and can thus be used to form polymorphic types). Pedantically, one does declare a polymorphic data type, but not any old polymorphic data type: rather, a type constructor applied to formal parameters.
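To make the narrow sense of "data type" concrete, here is a minimal sketch. Colour, Pair and the functions are hypothetical names chosen for illustration, not anything from the book.

```haskell
-- A data type in the narrow sense: introduced by a data declaration.
data Colour = Red | Green | Blue
  deriving (Eq, Show)

-- The declaration applies the type constructor Pair to the formal parameter a.
data Pair a = MkPair a a
  deriving (Eq, Show)

-- Int -> Int is a type, but no data declaration introduces it.
increment :: Int -> Int
increment = (+ 1)

-- Pair a contains a type variable, so this signature is polymorphic.
swapPair :: Pair a -> Pair a
swapPair (MkPair x y) = MkPair y x

-- Pair Int is a monomorphic type built from the same type constructor.
origin :: Pair Int
origin = MkPair 0 0
```

Saying "Pair is a polymorphic data type" is the loose usage described above: strictly, Pair is a type constructor with a parameter, and Pair a is the polymorphic type.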

Of course, all first-class values classified by types are in some sense "data", and in Haskell, types are not used to classify anything which is not a first-class value, so in that looser sense, every "type" is a "data type". The distinction becomes more meaningful in languages where there are things other than data which have types (e.g., methods in Java).

Informal usage is often somewhere in the middle and not very well defined. People are often driving at some sort of distinction between functions or processes and the sort of stuff (the "data") on which they operate. Or they might think of data as being "understood in terms of the way they're made" (and exposing their representation, e.g. by pattern matching) rather than "understood in terms of the way they're used". This last usage of "data" sits a little uncomfortably with the notion of an abstract data type, being a type which hides the representation of the underlying stuff. Representation-hiding abstract data types thus contrast rather strongly with representation-exposing algebraic data types, which is why it is rather unfortunate that "ADT" is casually used as an abbreviation for both.
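The contrast between the two meanings of "ADT" can be sketched as follows. Shape and Stack are hypothetical examples. In a single file nothing is actually hidden, so the export list that would do the hiding is shown only as a comment.

```haskell
-- Algebraic data type: the constructors are visible, so clients can
-- understand a Shape "in terms of the way it is made" by pattern matching.
data Shape = Circle Double | Rect Double Double

area :: Shape -> Double
area (Circle r) = pi * r * r
area (Rect w h) = w * h

-- Abstract data type: in its own module this would be exported without
-- the MkStack constructor, for example
--   module Stack (Stack, emptyStack, push, pop) where
-- so that clients could only use the operations below.
newtype Stack a = MkStack [a]

emptyStack :: Stack a
emptyStack = MkStack []

push :: a -> Stack a -> Stack a
push x (MkStack xs) = MkStack (x : xs)

pop :: Stack a -> Maybe (a, Stack a)
pop (MkStack [])       = Nothing
pop (MkStack (x : xs)) = Just (x, MkStack xs)
```

With the export list in place, a client could not tell that a Stack is represented by a list, which is the sense in which the type is "understood in terms of the way it is used".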

The upshot, I'm afraid, is vague.

pigworker answered Sep 18 '22 15:09


In this case, data type and type are synonymous. However, I will admit that confusion can arise because Haskell has two keywords data and type that perform two very different functions. To try to keep the distinction clear, it's important to be mindful of the context. Whenever you're talking about the types in a signature or types in general, the terms "data types" and "types" almost always refer to the same thing. Whenever you're talking about declaring types in code, there can be a difference.

A type declared with data is a new, user-defined type, so you can do things like

data Status = Ready | NotReady | Exploded

Here Ready, NotReady and Exploded are new constructors that Haskell does not predefine.

On the other hand, there is the type keyword that simply creates an alias to an existing type:

type Status = String

ready, notReady, exploded :: Status
ready = "Ready"
notReady = "NotReady"
exploded = "Exploded"

Here, Status is simply an alias for String, and anywhere you use String you can use a Status and vice versa. There aren't any constructors, just pre-built values to use. This approach is far less safe, because any String at all is accepted as a Status, and if you use something like this you will run into bugs at some point. type declarations are commonly used to make it clearer what certain arguments are for, such as

type FilePath = String

This alias is already defined in the standard Prelude, and if you see a function

doSomething :: FilePath -> IO ()

then you know immediately to pass it a file name. Compare that with

doSomething :: String -> IO ()

With this signature you have no idea what the function does, other than "something". Type aliases are also commonly used to reduce typing, such as:

type Point = (Double, Double)

Now you can use Point instead of (Double, Double) in your type signatures, which is shorter to write and more readable.
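A short sketch of the alias in use; distance and shift are hypothetical helpers. Since Point is only an alias, a plain (Double, Double) is accepted wherever a Point is expected.

```haskell
type Point = (Double, Double)

-- Euclidean distance between two points.
distance :: Point -> Point -> Double
distance (x1, y1) (x2, y2) = sqrt (dx * dx + dy * dy)
  where
    dx = x2 - x1
    dy = y2 - y1

-- The signature mixes the alias and the tuple type; both are the same type.
shift :: (Double, Double) -> Point -> Point
shift (dx, dy) (x, y) = (x + dx, y + dy)
```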


To summarize, data declares an entirely new type, completely custom just for you, and type should be renamed alias so that people stop getting confused about them when they first approach Haskell.
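The difference in safety can be seen in a minimal sketch. StatusText is a hypothetical alias standing in for the String-based Status above, renamed so that both versions fit in one file; the two predicate functions are also hypothetical.

```haskell
-- Alias version: StatusText is just another name for String.
type StatusText = String

-- Data version: a new type with exactly three values.
data Status = Ready | NotReady | Exploded
  deriving (Eq, Show)

isReadyText :: StatusText -> Bool
isReadyText s = s == "Ready"

isReady :: Status -> Bool
isReady Ready = True
isReady _     = False

-- This type checks although "Redy" is a typo: any String is a StatusText.
typoSlipsThrough :: Bool
typoSlipsThrough = isReadyText "Redy"

-- The equivalent mistake with the data type, isReady Redy, is rejected at
-- compile time because Redy is not a data constructor in scope.
```

With the alias, the typo only shows up at run time as a wrong answer; with the data declaration, the compiler reports it.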

bheklilr answered Sep 19 '22 15:09