I was revisiting a piece of code I wrote to do combinatorial search a few months ago, and noticed that there was an alternative, simpler way to do something that I'd previously achieved with a type class.
Specifically, I previously had a type class for the type of search problems, which have an states of type s
, actions (operations on states) of type a
, an initial state, a way of getting a list of (action,state) pairs and a way of testing whether a state is a solution or not:
class Problem p s a where initial :: p s a -> s successor :: p s a -> s -> [(a,s)] goaltest :: p s a -> s -> Bool
This is somewhat unsatisfactory, as it requires the MultiParameterTypeClass extension, and generally needs FlexibleInstances and possibly TypeSynonymInstances when you want to make instances of this class. It also clutters up your function signatures, e.g.
pathToSolution :: Problem p => p s a -> [(a,s)]
I noticed today that I can get rid of the class entirely, and use a type instead, along the following lines
data Problem s a { initial :: s, successor :: s -> [(a,s)], goaltest :: s -> Bool }
This doesn't require any extensions, the function signatures look nicer:
pathToSolution :: Problem s a -> [(a,s)]
and, most importantly, I found that after refactoring my code to use this abstraction instead of a type class, I was left with 15-20% fewer lines than I had previously.
The biggest win was in code that created abstractions using the type class - previously I had to create new data structures that wrapped the old ones in a complicated way, and then make them into instances of the Problem
class (which required more language extensions) - lots of lines of code to do something relatively simple. After the refactor, I just had a couple of functions that did exactly what I wanted to.
I'm now looking through the rest of the code, trying to spot instances where I can replace type classes with types, and make more wins.
My question is: in what situation does will this refactoring not work? In what cases is it actually just better to use a type class rather than a data type, and how can you recognise those situations ahead of time, so you don't have to go through a costly refactoring?
An object's class defines how the object is implemented. The class defines object's internal state and the implementation of its operations. In contrast, an object's type only refers to its interface - a set of requests to which it can respond.
A typeclass is a sort of interface that defines some behavior. If a type is a part of a typeclass, that means that it supports and implements the behavior the typeclass describes. A lot of people coming from OOP get confused by typeclasses because they think they are like classes in object oriented languages.
When should you use types in TypeScript? Unlike classes, types do not express functionality or logic inside your application. It's best to use types when you want to describe some form of information. They can describe varying shapes of data, ranging from simple constructs like strings, arrays, and objects.
In computer science, a type class is a type system construct that supports ad hoc polymorphism. This is achieved by adding constraints to type variables in parametrically polymorphic types.
Consider a situation where both the type and class exist in the same program. The type can be an instance of the class, but that's rather trivial. More interesting is that you can write a function fromProblemClass :: (CProblem p s a) => p s a -> TProblem s a
.
The refactoring you performed is roughly equivalent to manually inlining fromProblemClass
everywhere you construct something used as a CProblem
instance, and making every function that accepts a CProblem
instance instead accept TProblem
.
Since the only interesting parts of this refactoring are the definition of TProblem
and the implementation of fromProblemClass
, if you can write a similar type and function for any other class, you can likewise refactor it to eliminate the class entirely.
Think about the implementation of fromProblemClass
. You'll essentially be partially applying each function of the class to a value of the instance type, and in the process eliminating any reference to the p
parameter (which is what the type replaces).
Any situation where refactoring away a type class is straightforward is going to follow a similar pattern.
Imagine a simplified version of Show
, with only the show
function defined. This permits the same refactoring, applying show
and replacing each instance with... a String
. Clearly we've lost something here--namely, the ability to work with the original types and convert them to a String
at various points. The value of Show
is that it's defined on a wide variety of unrelated types.
As a rule of thumb, if there are many different functions specific to the types which are instances of the class, and these are often used in the same code as the class functions, delaying the conversion is useful. If there's a sharp dividing line between code that treats the types individually and code that uses the class, conversion functions might be more appropriate with a type class being a minor syntactic convenience. If the types are used almost exclusively through the class functions, the type class is probably completely superfluous.
Incidentally, the refactoring here is similar to the difference between a class and interface in OO languages; similarly, the type classes where this refactoring is impossible are those which can't be expressed directly at all in many OO languages.
More to the point, some examples of things you can't translate easily, if at all, in this manner:
The class's type parameter appearing only in covariant position, such as the result type of a function or as a non-function value. Notable offenders here are mempty
for Monoid
and return
for Monad
.
The class's type parameter appearing more than once in a function's type may not make this truly impossible but it complicates matters quite severely. Notable offenders here include Eq
, Ord
, and basically every numeric class.
Non-trivial use of higher kinds, the specifics of which I'm not sure how to pin down, but (>>=)
for Monad
is a notable offender here. On the other hand, the p
parameter in your class is not an issue.
Non-trivial use of multi-parameter type classes, which I'm also uncertain how to pin down and gets horrendously complicated in practice anyway, being comparable to multiple dispatch in OO languages. Again, your class doesn't have an issue here.
Note that, given the above, this refactoring is not even possible for many of the standard type classes, and would be counterproductive for the few exceptions. This is not a coincidence. :]
You give up the ability to distinguish between the original types. This sounds obvious, but it's potentially significant--if there are any situations where you really need to control which of the original class instance types was used, applying this refactoring loses some degree of type safety, which you can only recover by jumping through the same sort of hoops used elsewhere to ensure invariants at run-time.
Conversely, if there are situations where you really need to make the various instance types interchangeable--the convoluted wrapping you mentioned being a classic symptom of this--you gain a great deal by throwing away the original types. This is most often the case where you don't actually care much about the original data itself, but rather about how it lets you operate on other data; thus using records of functions directly is more natural than an extra layer of indirection.
As noted above, this relates closely to OOP and the type of problems it's best suited to, as well as representing the "other side" of the Expression Problem from what's typical in ML-style languages.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With