I was translating the following Haskell code to OCaml:
data NFA q s = NFA
{ intialState :: q
, isAccepting :: q -> Bool
, transition :: q -> s -> [q]
}
Initially I tried a very literal translation:
type ('q,'s) nfa = NFA of { initialState: 'q;
isAccepting: 'q -> bool;
transition: 'q -> 's -> 'q list }
...and of course this gives a syntax error because the type constructor part, "NFA of" isn't allowed. It has to be:
type ('q,'s) nfa = { initialState: 'q;
isAccepting: 'q -> bool;
transition: 'q -> 's -> 'q list }
That got me to wondering why this is so. Why can't you have the type constructor for a record type just as you could for a tuple type (as below)?
type ('q, 's) dfa = NFA of ('q * ('q->bool) * ( 'q -> 's -> 'q list) )
Why would you want a type constructor for record types, except because that's your habit in Haskell?
In Haskell, records are not exactly first-class constructs: they are more like a syntactic sugar on top of tuples. You can define record fields name, use them as accessors, and do partial record update, but that desugars into access by positions in plain tuples. The constructor name is therefore necessary to tell one record from another after desugaring: if you had no constructor name, two records with different field names but the same field types would desugar into equivalent types, which would be a bad thing.
In OCaml, records are a primitive notion and they have their own identity. Therefore, they don't need a head constructor to distinguish them from tuples or records of the same field types. I don't see why you would like to add a head constructor, as this is more verbose without giving more information or helping expressivity.
Why can't you have the type constructor for a record type just as you could for a tuple type (as below)?
Be careful ! There is no tuple in the example you show, only a sum type with multiple parameters. Foo of bar * baz
is not the same thing as Foo of (bar * baz)
: the former constructor has two parameters, and the latter constructor has only one parameter, which is a tuple. This differentiation is done for performances reasons (in memory, the two parameters are packed together with the constructor tag, while the tuple creates an indirection pointer). Using tuples instead of multi-parameters is slightly more flexible : you can match as both Foo (x, y) -> ...
or Foo p -> ...
, the latter not being available to multi-parameter constructors.
There is no asymmetry between tuples and records in that none of them has a special status in the sum type construction, which is only a sum of constructors of arbitrary arity. That said, it is easier to use tuples as parameter types for the constructors, as tuple types don't have to be declared to be used. Eg. some people have asked for the ability to write
type foo =
| Foo of { x : int; y : int }
| Bar of { z : foo list }
instead of the current
type foo = Foo of foo_t | Bar of bar_t
and foo_t = { x : int; y : int }
and bar_t = { z : foo list }
Your request is a particular (and not very interesting) case of this question. However, even with such shorthand syntax, there would still be one indirection pointer between the constructor and the data, making this style unattractive for performance-conscious programs -- what could be useful is the ability to have named constructor parameters.
PS: I'm not saying that Haskell's choice of desugaring records into tuples is a bad thing. By translating one feature into another, you reduce redundancy/overlapping of concepts. That said, I personally think it would be more natural to desugar tuples into records (with numerical field names, as done in eg. Oz). In programming language design, there are often no "good" and "bad" choices, only different compromises.
You can't have it because the language doesn't support it. It would actually be an easy fit into the type system or the data representation, but it would be a small additional complication in the compiler, and it hasn't been done. So yes, you have to choose between naming the constructor and naming the arguments.
Note that record label names are tied to a particular type, so e.g. {initialState=q}
is a pattern of type ('q, 's) nfa
; you can't usefully reuse the label name in a different type. So naming the constructor is only really useful when the type has multiple constructors; then, if you have a lot of arguments to a constructor, you may prefer to define a separate type for the arguments.
I believe there's a patch floating around for this feature, but I don't know if it's up-to-date for the latest OCaml version or anything, and it would require anyone using your code to have that patch.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With