I am currently trying to wrap my head around typeclasses and instances and I don't quite understand the point of them yet. I have two questions on the matter so far:
1) Why is it necessary to have a type class in a function signature when the function uses some function from that type class? Example:
f :: (Eq a) => a -> a -> Bool
f a b = a == b
Why put (Eq a) in the signature? If == is not defined for a, then why not just throw the error when encountering a == b? What is the point in having to declare the type class ahead?
2) How are type classes and function overloading related?
It is not possible to do this:
data A = A
data B = B
f :: A -> A
f a = a
f :: B -> B
f b = b
But it is possible to do this:
data A = A
data B = B
class F a where
  f :: a -> a
instance F A where
  f a = a
instance F B where
  f b = b
What is up with that? Why can't I have two functions with the same name but operating on different types, when I can once I wrap them in these type class instance thingies? Coming from C++ I find that very strange. But I probably have wrong conceptions about what these things really are.
Feel free to hurl category or type theoretical words at me as well, as I am learning about these subjects in parallel to learning Haskell and I suspect there is a theoretical basis in these for how Haskell does things here.
A typeclass is a sort of interface that defines some behavior. If a type is a part of a typeclass, that means that it supports and implements the behavior the typeclass describes. A lot of people coming from OOP get confused by typeclasses because they think they are like classes in object oriented languages.
There are mainly two kinds of overloading: function overloading and operator overloading. Function overloading improves code readability by keeping the same name for the same action. Operator overloading allows existing operators to be redefined, giving them a special meaning for new types.
Function overloading means that several functions share one name. In C++, they are distinguished by the number and types of their arguments. The compiler selects which function to call at compile time, so it is an example of compile-time polymorphism.
An overloaded function is really just a set of different functions that happen to have the same name. The determination of which function to use for a particular call is resolved at compile time. In Java, function overloading is also known as compile-time polymorphism and static polymorphism.
I agree with much of Willem Van Onsem’s answer, but I think it overlooks one of the principal advantages of typeclasses over truly ad-hoc overloading: abstraction. Imagine we used ad-hoc overloading instead of typeclasses to define the Monad operations:
-- Maybe
pure :: a -> Maybe a
pure = Just
(>>=) :: Maybe a -> (a -> Maybe b) -> Maybe b
Just x >>= f = f x
Nothing >>= _ = Nothing
-- Either
pure :: a -> Either e a
pure = Right
(>>=) :: Either e a -> (a -> Either e b) -> Either e b
Right x >>= f = f x
Left err >>= _ = Left err
Now, we know that every monad can be expressed in terms of pure and >>=, as above, but we also know that they can be equivalently expressed using fmap, pure, and join. Therefore, we should be able to implement a join function that works on any monad:
join x = x >>= id
However, now we have a problem. What is join’s type?
Clearly, join has to be polymorphic, since it works on any monad by design. But giving it the type signature forall m a. m (m a) -> m a would obviously be wrong, since it doesn’t work for all types, only monadic ones. Therefore, we need something in our type that expresses the need for the existence of some operation (>>=) :: m a -> (a -> m b) -> m b, which is exactly what the typeclass constraint provides.
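As a minimal sketch of that point, here is join written against the standard Monad class with the constraint spelled out. The name joinM is hypothetical and only chosen to avoid clashing with Control.Monad.join.

```haskell
-- joinM is a hypothetical name; the body is the same as join above.
-- The Monad m constraint is what tells the compiler that (>>=) exists for m.
joinM :: Monad m => m (m a) -> m a
joinM x = x >>= id

main :: IO ()
main = do
  print (joinM (Just (Just 3)))                             -- Just 3
  print (joinM [[1, 2], [3]])                               -- [1,2,3]
  print (joinM (Right (Left "oops")) :: Either String Int)  -- Left "oops"
```

Dropping the constraint from the signature should make the definition fail to type-check, because nothing would then promise that (>>=) is available for an arbitrary m.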
Given this, it becomes clear that ad-hoc overloading makes it possible to overload names, but it is impossible to abstract over those overloaded names because there is no guarantee the different implementations are related in any way. You could define monads without typeclasses, but then you couldn’t define join, when, unless, mapM, sequence, and all the other nice things that you get for free when you define just two operations.
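To illustrate the "for free" part, the sketch below writes two such helpers once, using only pure and (>>=), and then uses them at Maybe, Either, and IO. The names whenM and sequenceM are hypothetical stand-ins for the library's when and sequence, and the definitions are not claimed to match the library's own.

```haskell
-- whenM and sequenceM are hypothetical names, used to avoid Prelude clashes.

-- Run the action only if the condition holds.
whenM :: Monad m => Bool -> m () -> m ()
whenM cond action = if cond then action else pure ()

-- Run a list of actions left to right and collect their results.
sequenceM :: Monad m => [m a] -> m [a]
sequenceM []       = pure []
sequenceM (m : ms) = m >>= \x -> sequenceM ms >>= \xs -> pure (x : xs)

main :: IO ()
main = do
  print (sequenceM [Just 1, Just 2])                              -- Just [1,2]
  print (sequenceM [Just 1, Nothing])                             -- Nothing
  print (sequenceM [Right 1, Left "bad"] :: Either String [Int])  -- Left "bad"
  whenM True (putStrLn "ran")                                     -- ran
```

With ad-hoc overloading alone, each of these helpers would have to be repeated for every monad.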
Therefore, typeclasses are necessary in Haskell to enable code reuse and to avoid enormous amounts of duplication. But could you have both typeclass-style overloading and type-directed, ad-hoc name overloading? Yes, and in fact, Idris does. But Idris’s type inference is very different from Haskell’s, so it’s more feasible to support than it is in Haskell for many of the reasons in Willem’s answer.
In short: because that is how Haskell was designed.
Why put (Eq a) in the signature. If == is not defined for a then why not just throw the error when encountering a == b?
Why do we put the types in the signature of a C++ program (and not just somewhere as an assertion in the body)? Because that is how C++ is designed. A principle on which programming languages are typically built is "make explicit what needs to be explicit".
A Haskell module is not necessarily open source, so we may only have the signature available. If the constraint were missing from the signature, consider what would happen when we for instance write:
Prelude> foo A A
<interactive>:4:1: error:
• No instance for (Eq A) arising from a use of ‘foo’
• In the expression: foo A A
In an equation for ‘it’: it = foo A A
We would frequently call foo here with types that have no Eq instance. As a result, we would get a lot of errors that are only discovered at compile time (or, if Haskell were a dynamic language, at runtime). The idea of putting Eq a in the type signature is that we can look up the signature of foo in advance, and thus ensure that the types are instances of the typeclass.
Note that you do not have to write type signatures yourself: Haskell can typically infer the signature of a function, but a signature should include all the necessary information to call and use a function effectively. By adding type constraints, we speed up development.
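A small sketch of this, assuming foo has the same body as the question's f: no signature is written, yet the compiler still works out that an Eq constraint is needed. The deriving clause on A is an addition for illustration; without it, the call foo A A is rejected with the "No instance for (Eq A)" error shown above.

```haskell
-- No signature is given; the inferred type should be Eq a => a -> a -> Bool.
-- (Assumption: foo is defined like the question's f.)
foo a b = a == b

-- A gets an Eq instance here so that it can be passed to foo.
data A = A deriving Eq

main :: IO ()
main = do
  print (foo A A)      -- True
  print (foo 'x' 'y')  -- False
```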
What is up with that? Why can't I have two functions with the same name but operating on different types.
Again: that is how Haskell is designed. Functions in functional programming languages are "first-class citizens". This means they usually have a name, and we want to avoid name clashes as much as possible, just as classes in C++ typically have a unique name (apart from namespaces).
Say you would define two different functions:
incr :: Int -> Int
incr = (+1)
incr :: Bool -> Bool
incr _ = True
bar = incr
Then which incr would bar have to select? Of course we can make the types explicit (i.e. incr :: Bool -> Bool), but usually we want to avoid that work, since it introduces a lot of noise.
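For comparison, here is a sketch of the same example expressed with a typeclass. The class name Incr is hypothetical and introduced only for this illustration. In this version bar does not have to select anything: it stays polymorphic, and the instance is chosen at each use site from the argument type.

```haskell
-- Incr is a hypothetical class, introduced only for illustration.
class Incr a where
  incr :: a -> a

instance Incr Int where
  incr = (+ 1)

instance Incr Bool where
  incr _ = True

-- bar keeps the constraint, so it works for every instance of Incr.
bar :: Incr a => a -> a
bar = incr

main :: IO ()
main = do
  print (bar (1 :: Int))  -- 2
  print (bar False)       -- True
```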
Another good reason why we do not do that is that a typeclass is typically not just a collection of functions: it adds contracts to these functions. For instance, the functions of the Monad typeclass have to satisfy certain relations. For example, (>>= return) should be equivalent to id. In other words, the typeclass:
class Monad m where
  (>>=) :: m a -> (a -> m b) -> m b
  return :: a -> m a
does not describe two independent functions (>>=) and return: this is a set of functions. You have them both (usually with some contracts between the specific >>= and return), or none of these at all.
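The contract mentioned above, that (>>= return) behaves like id, is not checked by the compiler, but it can be spot-checked on concrete values. A minimal sketch, using the standard Monad class and a hypothetical helper name rightIdentityHolds:

```haskell
-- rightIdentityHolds is a hypothetical helper; it tests the law on one value
-- and does not prove it in general.
rightIdentityHolds :: (Monad m, Eq (m a)) => m a -> Bool
rightIdentityHolds m = (m >>= return) == m

main :: IO ()
main = do
  print (rightIdentityHolds (Just 3 :: Maybe Int))   -- True
  print (rightIdentityHolds (Nothing :: Maybe Int))  -- True
  print (rightIdentityHolds [1, 2, 3 :: Int])        -- True
```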