I find myself running into a problem commonly, when writing larger programs in Haskell. I find myself often wanting multiple distinct types that share an internal representation and several core operations.
There are two relatively obvious approaches to solving this problem.
One is using a type class and the GeneralizedNewtypeDeriving
extension. Put enough logic into a type class to support the shared operations that the use case desires. Create a type with the desired representation, and create an instance of the type class for that type. Then, for each use case, create wrappers for it with newtype, and derive the common class.
The other is to declare the type with a phantom type variable, and then use EmptyDataDecls
to create distinct types for each different use case.
My main concern is not mixing up values that share internal representation and operations, but have different meanings in my code. Both of those approaches solve that problem, but feel significantly clumsy. My second concern is reducing the amount of boilerplate required, and both approaches do well enough at that.
What are the advantages and disadvantages of each approach? Is there a technique that comes closer to doing what I want, providing type safety without boilerplate code?
Boilerplate code. The need for boilerplate can be reduced through high-level mechanisms such as metaprogramming (which has the computer automatically write the needed boilerplate code or insert it at compile time ), convention over configuration (which provides good default values, reducing the need to specify program details in every project)...
In computer programming, boilerplate code or just boilerplate are sections of code that have to be included in many places with little or no alteration.
Another example of a boilerplate is the fine print that appears on many contracts. This section is usually static, as is the case with many cell phone contracts.
Boilerplating in the Modern World. In contemporary times, the term boilerplate is widely applied in a variety of settings to refer to a standardized method, form or procedure. For example, computer programmers sometimes speak of using boilerplate code to write a new program because modern programs can consist of billions of lines of codes,...
There's another straightforward approach.
data MyGenType = Foo | Bar
op :: MyGenType -> MyGenType
op x = ...
op2 :: MyGenType -> MyGenType -> MyGenType
op2 x y = ...
newtype MySpecialType {unMySpecial :: MyGenType}
inMySpecial f = MySpecialType . f . unMySpecial
inMySpecial2 f x y = ...
somefun = ... inMySpecial op x ...
someOtherFun = ... inMySpecial2 op2 x y ...
Alternately,
newtype MySpecial a = MySpecial a
instance Functor MySpecial where...
instance Applicative MySpecial where...
somefun = ... fmap op x ...
someOtherFun = ... liftA2 op2 x y ...
I think these approaches are nicer if you want to use your general type "naked" with any frequency, and only sometimes want to tag it. If, on the other hand, you generally want to use it tagged, then the phantom type approach more directly expresses what you want.
I've benchmarked toy examples and not found a performance difference between the two approaches, but usage does typically differ a bit.
For instance, in some cases you have a generic type whose constructors are exposed and you want to use newtype
wrappers to indicate a more semantically specific type. Using newtype
s then leads to call sites like,
s1 = Specific1 $ General "Bob" 23
s2 = Specific2 $ General "Joe" 19
Where the fact that the internal representations are the same between the different specific newtypes is transparent.
The type tag approach almost always goes along with representation constructor hiding,
data General2 a = General2 String Int
and the use of smart constructors, leading to a data type definition and call sites like,
mkSpecific1 "Bob" 23
Part of the reason being that you want some syntactically light way of indicating which tag you want. If you didn't provide smart constructors, then client code would often pick up type annotations to narrow things down, e.g.,
myValue = General2 String Int :: General2 Specific1
Once you adopt smart constructors, you can easily add extra validation logic to catch misuses of the tag. A nice aspect of the phantom type approach is that pattern matching isn't changed at all for internal code that has access to the representation.
internalFun :: General2 a -> General2 a -> Int
internalFun (General2 _ age1) (General2 _ age2) = age1 + age2
Of course you can use the newtype
s with smart constructors and an internal class for accessing the shared representation, but I think a key decision point in this design space is whether you want to keep your representation constructors exposed. If the sharing of representation should be transparent, and client code should be free to use whatever tag it wishes with no extra validation, then newtype
wrappers with GeneralizedNewtypeDeriving
work fine. But if you are going to adopt smart constructors for working with opaque representations, then I usually prefer phantom types.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With