Handling multiple types with the same internal representation and minimal boilerplate?

I commonly run into a problem when writing larger programs in Haskell: I often want multiple distinct types that share an internal representation and several core operations.

There are two relatively obvious approaches to solving this problem.

One is using a type class and the GeneralizedNewtypeDeriving extension. Put enough logic into a type class to support the shared operations that the use case desires. Create a type with the desired representation, and create an instance of the type class for that type. Then, for each use case, create wrappers for it with newtype, and derive the common class.
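A minimal sketch of this first approach might look like the following. The Counter representation, the Countable class, and the Visits and Errors wrappers are all hypothetical names chosen for illustration; none of them come from a real code base.

```haskell
{-# LANGUAGE GeneralizedNewtypeDeriving #-}

-- Hypothetical shared representation: a plain counter.
newtype Counter = Counter Int
  deriving (Show, Eq)

-- Hypothetical class carrying the shared operations.
class Countable a where
  zero  :: a
  bump  :: a -> a
  total :: a -> Int

-- The single hand-written instance, on the representation type.
instance Countable Counter where
  zero = Counter 0
  bump (Counter n) = Counter (n + 1)
  total (Counter n) = n

-- One wrapper per use case; each reuses the Counter instance
-- through GeneralizedNewtypeDeriving.
newtype Visits = Visits Counter
  deriving (Show, Eq, Countable)

newtype Errors = Errors Counter
  deriving (Show, Eq, Countable)

-- Accepts only Visits; passing an Errors value is a compile-time error.
recordVisit :: Visits -> Visits
recordVisit = bump
```

Loaded into GHCi, `total (recordVisit zero)` type checks, while `recordVisit (zero :: Errors)` is rejected.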

The other is to declare the type with a phantom type variable, and then use EmptyDataDecls to create distinct types for each different use case.
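A comparable sketch of the phantom type approach, again with hypothetical names (Counter, VisitsTag, ErrorsTag, merge), could be:

```haskell
{-# LANGUAGE EmptyDataDecls #-}

-- Hypothetical shared representation. The parameter `tag` is phantom:
-- it appears only on the left-hand side and carries no runtime data.
newtype Counter tag = Counter Int
  deriving (Show, Eq)

-- Empty declarations used purely as type-level tags.
data VisitsTag
data ErrorsTag

type Visits = Counter VisitsTag
type Errors = Counter ErrorsTag

-- Shared operations are written once, polymorphic in the tag.
zero :: Counter tag
zero = Counter 0

bump :: Counter tag -> Counter tag
bump (Counter n) = Counter (n + 1)

total :: Counter tag -> Int
total (Counter n) = n

-- Both arguments must carry the same tag, so a Visits value
-- cannot be merged with an Errors value.
merge :: Counter tag -> Counter tag -> Counter tag
merge (Counter a) (Counter b) = Counter (a + b)
```

Here `merge (zero :: Visits) (zero :: Errors)` fails to type check, and no class or per-wrapper deriving clause is needed.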

My main concern is not mixing up values that share an internal representation and operations but have different meanings in my code. Both approaches solve that problem, but both feel clumsy. My second concern is reducing the amount of boilerplate required, and both approaches do well enough at that.

What are the advantages and disadvantages of each approach? Is there a technique that comes closer to doing what I want, providing type safety without boilerplate code?

There's another straightforward approach.

data MyGenType = Foo | Bar

op :: MyGenType -> MyGenType
op x = ...

op2 :: MyGenType -> MyGenType -> MyGenType
op2 x y = ...

newtype MySpecialType = MySpecialType {unMySpecial :: MyGenType}

inMySpecial f = MySpecialType . f . unMySpecial
inMySpecial2 f x y = ...

somefun = ... inMySpecial op x ...
someOtherFun = ... inMySpecial2 op2 x y ...
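To make that outline concrete, here is one way it could be filled in. The bodies of op and op2 and the definition of inMySpecial2 are my assumptions for illustration; the outline above leaves them elided.

```haskell
data MyGenType = Foo | Bar
  deriving (Show, Eq)

-- Hypothetical operation bodies, chosen only so the sketch compiles.
op :: MyGenType -> MyGenType
op Foo = Bar
op Bar = Foo

op2 :: MyGenType -> MyGenType -> MyGenType
op2 Foo y = y
op2 Bar y = op y

newtype MySpecialType = MySpecialType { unMySpecial :: MyGenType }
  deriving (Show, Eq)

-- Lift a unary operation on the general type to the tagged type.
inMySpecial :: (MyGenType -> MyGenType) -> MySpecialType -> MySpecialType
inMySpecial f = MySpecialType . f . unMySpecial

-- Lift a binary operation: unwrap both arguments, rewrap the result.
-- This body is an assumed definition, not the original one.
inMySpecial2
  :: (MyGenType -> MyGenType -> MyGenType)
  -> MySpecialType -> MySpecialType -> MySpecialType
inMySpecial2 f x y = MySpecialType (f (unMySpecial x) (unMySpecial y))
```

The general operations stay usable on "naked" MyGenType values, and the lifting helpers are the only code specific to the wrapper.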

Alternately,

newtype MySpecial a = MySpecial a
instance Functor MySpecial where...
instance Applicative MySpecial where...

somefun = ... fmap op x ...
someOtherFun = ... liftA2 op2 x y ...
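Spelled out, the instances are short. In this sketch the bodies of op and op2 and the names taggedOp and taggedOp2 are hypothetical, added only so the example is complete.

```haskell
import Control.Applicative (liftA2)

newtype MySpecial a = MySpecial a
  deriving (Show, Eq)

instance Functor MySpecial where
  fmap f (MySpecial a) = MySpecial (f a)

instance Applicative MySpecial where
  pure = MySpecial
  MySpecial f <*> MySpecial a = MySpecial (f a)

data MyGenType = Foo | Bar
  deriving (Show, Eq)

-- Hypothetical operation bodies, chosen only so the sketch compiles.
op :: MyGenType -> MyGenType
op Foo = Bar
op Bar = Foo

op2 :: MyGenType -> MyGenType -> MyGenType
op2 Foo y = y
op2 Bar y = op y

-- The general operations are reused on tagged values via fmap and liftA2.
taggedOp :: MySpecial MyGenType -> MySpecial MyGenType
taggedOp = fmap op

taggedOp2 :: MySpecial MyGenType -> MySpecial MyGenType -> MySpecial MyGenType
taggedOp2 = liftA2 op2
```

Because MySpecial is parameterised over its contents, the same wrapper and instances can tag any general type, not just MyGenType.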

I think these approaches are nicer if you want to use your general type "naked" with any frequency, and only sometimes want to tag it. If, on the other hand, you generally want to use it tagged, then the phantom type approach more directly expresses what you want.

I've benchmarked toy examples and not found a performance difference between the two approaches, but usage does typically differ a bit.

For instance, in some cases you have a generic type whose constructors are exposed and you want to use newtype wrappers to indicate a more semantically specific type. Using newtypes then leads to call sites like,

s1 = Specific1 $ General "Bob" 23
s2 = Specific2 $ General "Joe" 19

Here, the fact that the different specific newtypes share an internal representation is transparent.

The type tag approach almost always goes along with representation constructor hiding,

data General2 a = General2 String Int

and the use of smart constructors, leading to call sites like,

mkSpecific1 "Bob" 23

Part of the reason is that you want some syntactically light way of indicating which tag you want. If you didn't provide smart constructors, then client code would often pick up type annotations to narrow things down, e.g.,

myValue = General2 "Bob" 23 :: General2 Specific1

Once you adopt smart constructors, you can easily add extra validation logic to catch misuses of the tag. A nice aspect of the phantom type approach is that pattern matching isn't changed at all for internal code that has access to the representation.

internalFun :: General2 a -> General2 a -> Int
internalFun (General2 _ age1) (General2 _ age2) = age1 + age2
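Putting those pieces together, a rough sketch of the whole arrangement could look like this. The tag types, the Maybe-returning mkSpecific1, and its non-negative age rule are assumptions made for illustration; in a real module the export list would leave out the General2 constructor so that clients must go through the smart constructors.

```haskell
{-# LANGUAGE EmptyDataDecls #-}

-- Shared representation with a phantom tag parameter.
data General2 a = General2 String Int
  deriving (Show, Eq)

-- Type-level tags with no values.
data Specific1
data Specific2

-- Smart constructor with a hypothetical validation rule:
-- Specific1 values must have a non-negative age.
mkSpecific1 :: String -> Int -> Maybe (General2 Specific1)
mkSpecific1 name age
  | age >= 0  = Just (General2 name age)
  | otherwise = Nothing

-- Smart constructor with no validation (also an assumption).
mkSpecific2 :: String -> Int -> General2 Specific2
mkSpecific2 = General2

-- Internal code pattern matches on the representation as usual;
-- the shared type variable forces both arguments to have the same tag.
internalFun :: General2 a -> General2 a -> Int
internalFun (General2 _ age1) (General2 _ age2) = age1 + age2
```

With these definitions, `internalFun (mkSpecific2 "Joe" 19) (mkSpecific2 "Ann" 30)` type checks, while mixing a Specific1 value with a Specific2 value does not.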

Of course you can use the newtypes with smart constructors and an internal class for accessing the shared representation, but I think a key decision point in this design space is whether you want to keep your representation constructors exposed. If the sharing of representation should be transparent, and client code should be free to use whatever tag it wishes with no extra validation, then newtype wrappers with GeneralizedNewtypeDeriving work fine. But if you are going to adopt smart constructors for working with opaque representations, then I usually prefer phantom types.

Tags:

haskell

Carl

2 Answers

sclv

Anthony
